NEUROGENIC DIFFERENTIATION (NeuroD) GENES AND PROTEINS

This invention was made with government support under grant CA42506 
awarded by the National Institutes of Health. The government has certain rights in 
the invention. 

This application is a continuation-in-part of co-pending U.S. application
No. 08/552,142, filed November 2, 1995, which is a continuation-in-part of PCT
application No. PCT/US95/05741, which is a continuation-in-part of parent
application U.S. Serial No. 08/239,238, filed May 6, 1994 (abandoned).

Field of the Invention 

The invention relates to molecular biology and in particular to genes and
proteins involved in vertebrate neural development.

Background of the Invention 
Transcription factors of the basic helix-loop-helix (bHLH) family are
implicated in the regulation of differentiation in a wide variety of cell types, including
trophoblast cells (Cross et al., Development 121:2513-2523, 1995), pigment cells
(Steingrimsson et al., Nature Gen. 8:251-255, 1994), B-cells (Shen, C.P. and T.
Kadesch, Molec. & Cell. Biol. 15:3813-3822, 1995; Zhuang et al., Cell 79:875-884,
1994), chondrocytes and osteoblasts (Cserjesi et al., Development 121:1099-1110,
1995; Tamura, M. and M. Noda, J. Cell Biol. 126:773-782, 1994), and cardiac
muscle (Burgess et al., Develop. Biol. 168:296-306, 1995; Hollenberg et al., Molec.
& Cell. Biol. 15:3813-3822, 1995). bHLH proteins form homodimeric and
heterodimeric complexes that bind to DNA in the 5' regulatory regions of genes
controlling expression.

Perhaps the most extensively studied sub-families of bHLH proteins are those
that regulate myogenesis and neurogenesis. The myogenic bHLH factors (MyoD,
myogenin, Myf5, and MRF4) appear to have unique as well as redundant functions
during myogenesis (Weintraub, H., Cell 75:1241-1244, 1993; Weintraub et al.,
Science 251:761-766, 1991). It is thought that either Myf5 or MyoD is necessary to
determine myogenic fate, whereas myogenin is necessary for events involved in
terminal differentiation (Hasty et al., Nature 364:501-506, 1993; Nabeshima et al.,
Nature 364:532-535, 1993; Rudnicki et al., Cell 75:1351-1359, 1993; Venuti et al., J.
Cell Biol. 128:563-576, 1995). Recent work on neurogenic bHLH proteins suggests
parallels between the myogenic and neurogenic sub-families of bHLH proteins. Genes
of the Drosophila melanogaster achaete-scute complex and the atonal gene have
been shown to be involved in neural cell fate determination (Anderson, D. J., Cur.
Biol. 5:1235-1238, 1995; Campuzano, S. and J. Modolell, Trends in Genetics
8:202-208, 1992; Jarman et al., Cell 73:1307-1321, 1993), and the mammalian
homologs, MASH1 and MATH1, are expressed in the neural tube at the time of
neurogenesis (Akazawa et al., J. Biol. Chem. 270:8730-8738, 1995; Lo et al., Genes
& Dev. 5:1524-1537, 1991). Two related vertebrate bHLH proteins, neuroD
(hereafter referred to as "neuroD1") and NEX-1/MATH-2, are expressed slightly later
in CNS development, predominantly in the marginal layer of the neural tube and
persisting in the mature nervous system (Bartholoma, A. and K. A. Nave, Mech. Dev.
48:217-228, 1994; Lee et al., Science 268:836-844, 1995; Shimizu et al., Eur. J.
Biochem. 229:239-248, 1995). NeuroD1 was also cloned as a factor that regulates
insulin transcription in pancreatic beta cells and named "Beta2" (Naya et al., Genes &
Dev. 9:1009-1019, 1995). Constitutive expression of neuroD1 in developing
Xenopus embryos produces ectopic neurogenesis in the ectodermal cells, indicating
that neuroD is capable of regulating a neurogenic program. A neuroD1 homolog
having 36,873 nucleotides has been identified in C. elegans (Lee et al., 1995;
Genbank Accession No. 010402), suggesting that this molecular mechanism of
regulating neurogenesis may be conserved between vertebrates and invertebrates.

Neural tissues and endocrine tissues do not regenerate. Damage is permanent.
Paralysis, loss of vision or hearing, and hormonal insufficiency are also intractable
medical conditions. Furthermore, tumors in neural and endocrine tissues can be very
difficult to treat because of the toxic side effects that conventional chemotherapeutic
drugs may have on nervous tissues. The medical community and public would greatly
benefit from the availability of agents active in triggering differentiation in
neuroectodermal stem cells. Such neuronal differentiating agents could be used for
construction of test cell lines, assays for identifying candidate therapeutic agents
capable of inducing regeneration of neuronal and endocrine tissues, gene therapy, and
differentiation of tumor cells.

Summary of the Invention 
The presently disclosed neuroD proteins represent a new sub-family of bHLH
proteins and are implicated in vertebrate neuronal, endocrine and gastrointestinal
development. Mammalian and amphibian neuroD proteins were identified, and
polynucleotide molecules encoding neuroD proteins were isolated and sequenced.
NeuroD genes encode proteins that are distinctive members of the bHLH family. In
addition, the present invention provides a family of neuroD proteins that share a
highly conserved HLH region. Representative polynucleotide molecules encoding
members of the neuroD family include neuroD1, neuroD2 and neuroD3.

A representative nucleotide sequence encoding murine neuroD1 is shown in
SEQ ID NO:1. The HLH coding domain of murine neuroD1 resides between
nucleotides 577 and 696 in SEQ ID NO:1. The deduced amino acid sequence of
murine neuroD1 is shown in SEQ ID NO:2. There is a highly conserved region
following the helix-2 domain from amino acid 150 through amino acid 199 of SEQ ID
NO:2 that is not shared by other bHLH proteins.

A representative nucleotide sequence encoding Xenopus neuroD1 is shown in
SEQ ID NO:3. The HLH coding domain of Xenopus neuroD1 resides between
nucleotides 376 and 495 in SEQ ID NO:3. The deduced amino acid sequence of
Xenopus neuroD1 is shown in SEQ ID NO:4. There is a highly conserved region
following the helix-2 domain from amino acid 157 through amino acid 199 of SEQ ID
NO:4 that is not shared by other bHLH proteins.

Representative nucleotide and deduced amino acid sequences of the human
neuroD family are shown in SEQ ID NOS:8-15. Representative nucleotide and
deduced amino acid sequences of a human homolog of murine neuroD1 are shown in
SEQ ID NOS:8 and 9 (partial genomic sequence) and SEQ ID NOS:14 and 15
(human neuroD1 cDNA). Representative nucleotide and deduced amino acid
sequences of the human and murine neuroD2 are shown in SEQ ID NOS:10 and 11,
and 16 and 17, respectively. Representative nucleotide and deduced amino acid
sequences for human neuroD3 are shown in SEQ ID NOS:12 and 13. The disclosed
human clones, 9F1 (and its corresponding cDNA HC2A, now referred to as human
neuroD1) and 14B1 (now referred to as human neuroD2), have an identical HLH
motif: amino acid residues 117-156 in SEQ ID NOS:9 and 15, and residues 137-176
in SEQ ID NO:11 (corresponding to nucleotides 405-524 of SEQ ID NO:8 and SEQ
ID NO:14, and nucleotides 463-582 of SEQ ID NO:10). Comparison of the deduced
amino acid sequences of these neuroD genes shows that human neuroD3 contains an
HLH domain between amino acid residues 108-147 of SEQ ID NO:13 (corresponding
to nucleotides 376-495 of SEQ ID NO:12) and that murine neuroD2 contains an
HLH domain between amino acid residues 138-177 of SEQ ID NO:17
(corresponding to nucleotides 641-760 of SEQ ID NO:16). The HLH domain of
murine neuroD2 is identical to that of the human neuroD1 and human neuroD2
proteins. Similar analyses indicated that mouse neuroD3 contains an HLH domain
between amino acid residues 109-148 of SEQ ID NO:22 (corresponding to
nucleotides 425-544 of SEQ ID NO:21).
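
The HLH coordinates quoted above can be cross-checked arithmetically. The following is a minimal sketch, not part of the disclosed methods: it confirms that each quoted nucleotide range is three times as long as the matching residue range, and it back-calculates where the first codon would lie. The function names are illustrative, and the back-calculation assumes an uninterrupted open reading frame between the first codon and the HLH region, which the text does not state.

```python
# HLH coordinates as quoted in the text:
# (nucleotide SEQ ID, first nt, last nt, protein SEQ ID, first aa, last aa)
HLH_COORDS = [
    ("NO:8/NO:14", 405, 524, "NO:9/NO:15", 117, 156),
    ("NO:10", 463, 582, "NO:11", 137, 176),
    ("NO:12", 376, 495, "NO:13", 108, 147),
    ("NO:16", 641, 760, "NO:17", 138, 177),
    ("NO:21", 425, 544, "NO:22", 109, 148),
]


def inclusive_length(first, last):
    """Number of positions in an inclusive, 1-based range."""
    return last - first + 1


def implied_orf_start(first_nt, first_aa):
    """Nucleotide position of codon 1, assuming no intron or frame shift
    between codon 1 and the HLH region (an assumption of this sketch)."""
    return first_nt - 3 * (first_aa - 1)


def check_hlh_coords(coords=HLH_COORDS):
    """Return one summary row per quoted HLH domain."""
    report = []
    for nt_id, nt1, nt2, aa_id, aa1, aa2 in coords:
        n_nt = inclusive_length(nt1, nt2)
        n_aa = inclusive_length(aa1, aa2)
        report.append({
            "nucleotide_seq": nt_id,
            "protein_seq": aa_id,
            "nt_length": n_nt,
            "aa_length": n_aa,
            "consistent": n_nt == 3 * n_aa,  # three nucleotides per residue
            "implied_orf_start": implied_orf_start(nt1, aa1),
        })
    return report


if __name__ == "__main__":
    for row in check_hlh_coords():
        print(row)
```

Every quoted pair describes a 120-nucleotide, 40-residue HLH region, so the coordinate pairs are internally consistent.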

Brief Description of the Drawings

FIGURE 1 schematically depicts the domain structure of the murine and 
Xenopus neuroD bHLH proteins. 

Detailed Description of the Preferred Embodiment 
Tissue-specific bHLH proteins that regulate early neuroectodermal
differentiation were discovered using expression cloning and screening assays
designed to identify possible bHLH proteins capable of interacting with the protein
product of the Drosophila daughterless gene. These proteins belong to a family of
proteins that share conserved residues in the HLH region. The subject invention
provides neuroD2 and neuroD3, which are two novel genes related to neuroD1, and
which have been isolated and whose nucleotide sequences have been determined. The
term "neuroD," as used here, encompasses all members of the neuroD family, and
includes neuroD1, neuroD2 and neuroD3 coding sequences and proteins.

The neuroD family of genes functions during the development of the nervous
system. Like MATH1 (Lo et al., Genes & Dev. 5:1524-1537, 1991), the expression
of neuroD3 peaks during embryonic development and is not detected in the mature
nervous system. NeuroD2 shows a high degree of sequence similarity to both
neuroD1 and NEX-1/MATH-2, and is similarly expressed both during embryogenesis
and in the mature nervous system, demonstrating an expression pattern that partially
overlaps with neuroD1. Like neuroD1, neuroD2 when expressed by transfection in
Xenopus embryos induces neurogenesis in ectodermal cells. Transfection of
expression vectors for neuroD1 and neuroD2 indicates that these highly similar
transcription factors demonstrate some target specificity, with the GAP-43 promoter
being activated by neuroD2 and not by neuroD1. The partially overlapping
expression pattern and target specificity of neuroD1 and neuroD2 suggest that this
group of neurogenic transcription factors may contribute to the establishment of
neuronal identity in the nervous system by acting on an overlapping but
non-congruent set of target genes.

NeuroD proteins are transiently expressed in differentiating neurons during
embryogenesis. NeuroD proteins are also detected in adult brain, in the granule layer
of the hippocampus and the cerebellum. In addition, murine neuroD1 expression has
been detected in the pancreas and gastrointestinal tissues of developing embryos and
post-natal mice (see, e.g., Example 14).

NeuroD proteins contain the basic helix-loop-helix (bHLH) domain structure
that has been implicated in the binding of bHLH proteins to upstream recognition
sequences and activation of downstream target genes. The present invention provides
representative neuroD proteins, which include the murine neuroD1 protein of SEQ ID
NO:2, the amphibian neuroD1 protein of SEQ ID NO:4, murine neuroD2 protein of
SEQ ID NO:17, human neuroD1 protein of SEQ ID NOS:9 and 15, human neuroD2
protein of SEQ ID NO:11, human neuroD3 protein of SEQ ID NO:13, and mouse
neuroD3 protein of SEQ ID NO:22. Based on homology with other bHLH proteins, the
bHLH domain for murine neuroD1 is predicted to reside between amino acids 102
and 155 of SEQ ID NO:2, and between amino acids 101 and 157 of SEQ ID NO:4
for the amphibian neuroD1.

As detailed below, the present invention provides the identification of human
neuroD1 and, in addition, provides unexpected homologous genes of the same family
based on highly conserved sequences across the HLH domain shared between the two
human genes at the amino acid level (neuroD2 and neuroD3; SEQ ID NOS:10 and
11, and 12 and 13, respectively).

NeuroD proteins are transcriptional activators that control transcription of
downstream target genes including genes that among other activities cause neuronal
progenitors to differentiate into mature neurons. In the neurula stage of the mouse
embryo (e10), murine neuroD1 is highly expressed in the neurogenic derivatives of
neural crest cells, the cranial and dorsal root ganglia, and postmitotic cells in the
central nervous system (CNS). During mouse development, neuroD1 is expressed
transiently and concomitant with neuronal differentiation in differentiating neurons in
sensory organs such as in nasal epithelium and retina. In Xenopus embryos, ectopic
expression of neuroD1 in non-neuronal cells induced formation of neurons. As
discussed in more detail below, neuroD proteins are expressed in differentiating
neurons and are capable of causing the conversion of non-neuronal cells into neurons.
The present invention encompasses variants of neuroD genes that, for example, are
modified in a manner that results in a neuroD protein capable of binding to its
recognition site, but unable to activate downstream genes. The present invention also
encompasses fragments of neuroD proteins that, for example, are capable of binding
the natural neuroD partner, but that are incapable of activating downstream genes.
NeuroD proteins encompass proteins retrieved from naturally occurring materials and
closely related, functionally similar proteins retrieved by antisera specific to neuroD
proteins, and recombinantly expressed proteins encoded by genetic materials (DNA,
RNA, cDNA) retrieved on the basis of their similarity to the unique regions in the
neuroD family of genes.

The present invention provides representative isolated and purified
polynucleotide molecules encoding proteins of the neuroD family. Representative
polynucleotide molecules encoding various neuroD proteins include the sequences
presented in SEQ ID NOS:1, 3, 8, 10, 12, 14, and 16. Polynucleotide molecules
encoding neuroD include those sequences resulting in minor genetic polymorphisms,
differences between species, and those that contain amino acid substitutions,
additions, and/or deletions. According to the present invention, polynucleotide
molecules encoding neuroD proteins encompass those molecules that encode neuroD
proteins or peptides that share identity with the sequences shown in SEQ ID NOS:2,
4, 9, 11, 13, 15, and 17. Such molecules will generally share greater than 35%
identity at the amino acid level with the disclosed sequences. The neuroD genes of
the present invention may share greater identity at the amino acid level across highly
conserved regions such as the HLH domain. For example, the deduced amino acid
sequences of murine and Xenopus neuroD1 genes are 96% identical within this
domain.
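
Percent identity figures such as those quoted above are computed from an alignment of two sequences. The following is a minimal sketch of one common convention, in which identical positions are divided by the number of aligned positions where neither sequence has a gap. The text does not say which convention was used, so this choice is an assumption. The function name and the two toy sequences are hypothetical and are not taken from the sequence listing.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity between two pre-aligned sequences of equal length.

    Columns in which either sequence carries a gap are skipped, which is
    one of several conventions in use (an assumption of this sketch).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # skip gap columns
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


if __name__ == "__main__":
    # Hypothetical 25-residue toy sequences differing at one position.
    toy_a = "ACDEFGHIKLMNPQRSTVWYACDEF"
    toy_b = "ACDEFGHIKLMNPQRSTVWYACDEY"
    print(percent_identity(toy_a, toy_b))
```

A sequence pair can then be compared against the 35% figure given for the full-length proteins, or against the higher identity expected across the HLH domain.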

In some instances, one may employ such changes in the sequence of a
recombinant neuroD polynucleotide molecule to substantially decrease or even
increase the biological activity of neuroD protein relative to the wild-type neuroD
activity, depending on the intended use of the preparation. Such changes may also be
directed towards endogenous neuroD polynucleotide sequences using, for example,
gene therapy methods to alter the gene product. Such changes are envisioned with
regard to neuroD1, neuroD2, neuroD3, or other members of the neuroD gene family.

The neuroD1 proteins of the present invention are capable of inducing the
expression in a frog embryo of neuron-specific genes, such as N-CAM, β-tubulin,
Xen-1, neurofilament M (NF-M), Xen-2, tanabin-1, shaker-1, and frog HSCL. As
described below in Example 10, neuroD1 activity may be detected when neuroD is
ectopically expressed in frog oocytes following, for example, injection of Xenopus
neuroD1 RNA into one of the two cells in a two-cell stage Xenopus embryo, and
monitoring expression of neuronal-specific genes in the injected as compared to
uninjected side of the embryo by immunochemistry or in situ hybridization.

"Over-expression" means an increased level of a neuroD protein or of neuroD
transcripts in a recombinant transformed host cell relative to the level of protein or
transcripts in the parental cell from which the host cell is derived.

As noted above, the present invention provides isolated and purified
polynucleotide molecules encoding various members of the neuroD family. The
disclosed sequences may be used to identify and isolate additional neuroD
polynucleotide molecules from suitable mammalian or non-mammalian host cells such
as canine, ovine, bovine, caprine, lagomorph, or avian. In particular, the nucleotide
sequences encoding the HLH region may be used to identify polynucleotide molecules
encoding other proteins of the neuroD family. Complementary DNA molecules
encoding neuroD family members may be obtained by constructing a cDNA library
using mRNA from, for example, fetal brain, newborn brain, and adult brain tissues.
DNA molecules encoding neuroD family members may be isolated from such a library
using the disclosed sequences to provide probes to be used in standard hybridization
methods (e.g., Sambrook et al., Molecular Cloning: A Laboratory Manual, Second
Edition, Cold Spring Harbor, NY, 1989, which is incorporated herein by reference,
and Bothwell, Yancopoulos and Alt, ibid.) or by amplification of sequences using
the polymerase chain reaction (PCR) (e.g., Loh et al., Science 243:217-222,
1989; Frohman et al., Proc. Natl. Acad. Sci. USA 85:8998-9002, 1988; Erlich (ed.),
PCR Technology: Principles and Applications for DNA Amplification, Stockton
Press, 1989; and Mullis et al., PCR: The Polymerase Chain Reaction, 1994, which
are incorporated by reference herein in their entirety). In a similar manner, genomic
DNA encoding neuroD proteins may be obtained using probes designed from the
sequences disclosed herein. Suitable probes for use in identifying neuroD genes or
transcripts may be obtained from neuroD-specific sequences that are highly conserved
between mammalian and amphibian neuroD coding sequences. Nucleotide
sequences, for example, from the region encoding the approximately 40 residues
following the helix-2 domain are suitable for use in designing PCR primers.
Alternatively, oligonucleotides containing specific DNA sequences from a human
neuroD1, neuroD2, or neuroD3 coding region may be used within the described
methods to identify related human neuroD genomic and cDNA clones. Upstream
regulatory regions of neuroD may be obtained using the same methods. Suitable PCR
primers are between 7 and 50 nucleotides in length, more preferably between 15 and 25
nucleotides in length. Alternatively, neuroD polynucleotide molecules may be
isolated using standard hybridization techniques with probes of at least about 15
nucleotides in length and up to and including the full coding sequence. Southern
analysis of mouse genomic DNA probed with the murine neuroD1 cDNA under
stringent conditions showed the presence of only one gene, suggesting that under
stringent conditions bHLH genes from other protein families will not be identified.
Other members of the neuroD family can be identified using degenerate
oligonucleotides based on the sequences disclosed herein for PCR amplification or by
hybridization at moderate stringency using probes based on the disclosed sequences.
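
When degenerate oligonucleotides are designed from a conserved peptide, the number of distinct sequences in the primer pool is the product of the codon counts of the encoded residues. The following is a minimal sketch of that calculation under the standard genetic code, with a helper that scans a peptide for its least degenerate window. The function names and the example peptide are hypothetical and are not taken from the sequence listing; a six-residue window corresponds to an 18-nucleotide primer, within the preferred 15 to 25 nucleotide range.

```python
# Number of codons per amino acid in the standard genetic code.
CODON_COUNTS = {
    "A": 4, "R": 6, "N": 2, "D": 2, "C": 2, "Q": 2, "E": 2, "G": 4,
    "H": 2, "I": 3, "L": 6, "K": 2, "M": 1, "F": 2, "P": 4, "S": 6,
    "T": 4, "W": 1, "Y": 2, "V": 4,
}


def degeneracy(peptide):
    """Number of distinct nucleotide sequences that encode the peptide."""
    total = 1
    for residue in peptide.upper():
        if residue not in CODON_COUNTS:
            raise ValueError(f"unknown amino acid: {residue!r}")
        total *= CODON_COUNTS[residue]
    return total


def least_degenerate_window(peptide, n_residues):
    """Return (offset, window, degeneracy) for the window of n_residues
    with the lowest degeneracy; the first such window wins ties."""
    if not 0 < n_residues <= len(peptide):
        raise ValueError("window must fit inside the peptide")
    windows = [
        (i, peptide[i:i + n_residues])
        for i in range(len(peptide) - n_residues + 1)
    ]
    offset, window = min(windows, key=lambda item: degeneracy(item[1]))
    return offset, window, degeneracy(window)


if __name__ == "__main__":
    hypothetical_peptide = "ALSRMWHFNKGL"  # illustration only
    print(least_degenerate_window(hypothetical_peptide, 6))
```

Windows rich in methionine and tryptophan (one codon each) give the smallest pools, whereas leucine, serine and arginine (six codons each) inflate them quickly.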

The regulatory regions of neuroD may be useful as tissue-specific promoters.
Such regulatory regions may find use in, for example, gene therapy to drive the
tissue-specific expression of heterologous genes in pancreatic, gastrointestinal, or neural
cells, tissues or cell lines. As shown in Example 14, murine neuroD1 promoter
sequences reside within the 1.4 kb 5' untranslated region. Regulatory sequences
within this region are identified by comparison to other promoter sequences and/or
deletion analysis of the region itself.

In other aspects of the invention, a DNA molecule encoding a neuroD protein is
inserted into a suitable expression vector, which is in turn used to transfect or
transform a suitable host cell. Suitable expression vectors for use in carrying out the
present invention include a promoter capable of directing the transcription of a
polynucleotide molecule of interest in a host cell and may also include a transcription
termination signal, these elements being operably linked in the vector. Representative
expression vectors may include both plasmid and/or viral vector sequences. Suitable
vectors include retroviral vectors, vaccinia viral vectors, CMV viral vectors,
BLUESCRIPT® vectors, baculovirus vectors, and the like. Promoters capable of
directing the transcription of a cloned gene or cDNA may be inducible or constitutive
promoters and include viral and cellular promoters. For expression in mammalian
host cells, suitable viral promoters include the immediate early cytomegalovirus
promoter (Boshart et al., Cell 41:521-530, 1985) and the SV40 promoter (Subramani
et al., Mol. Cell. Biol. 1:854-864, 1981). Suitable cellular promoters for expression
of proteins in mammalian host cells include the mouse metallothionein-1 promoter
(Palmiter et al., U.S. Patent No. 4,579,821), a mouse Vκ promoter (Bergman et al.,
Proc. Natl. Acad. Sci. USA 81:7041-7045, 1983; Grant et al., Nucleic Acids Res.
15:5496, 1987), and a tetracycline-responsive promoter (Gossen and Bujard, Proc.
Natl. Acad. Sci. USA 89:5547-5551, 1992, and Pescini et al., Biochem. Biophys. Res.
Comm. 202:1664-1667, 1994). Also contained in the expression vectors, typically, is
a transcription termination signal located downstream of the coding sequence of
interest. Suitable transcription termination signals include the early or late
polyadenylation signals from SV40 (Kaufman and Sharp, Mol. Cell. Biol.
2:1304-1319, 1982), the polyadenylation signal from the Adenovirus 5 E1B region, and the
human growth hormone gene terminator (DeNoto et al., Nucleic Acids Res.
9:3719-3730, 1981). Mammalian cells, for example, may be transfected by a number of
methods including calcium phosphate precipitation (Wigler et al., Cell 14:725, 1978;
Corsaro and Pearson, Somatic Cell Genetics 7:603, 1981; Graham and Van der Eb,
Virology 52:456, 1973), lipofection, microinjection, and electroporation (Neumann et
al., EMBO J. 1:841-845, 1982). Mammalian cells can be transduced with viruses
such as SV40, CMV, and the like. In the case of viral vectors, cloned DNA
molecules may be introduced by infection of susceptible cells with viral particles.
Retroviral vectors may be preferred for use in expressing neuroD proteins in
mammalian cells, particularly if the neuroD genes are used for gene therapy (for review,
see Miller et al., Methods in Enzymology 217:581-599, 1994; which is incorporated
herein by reference in its entirety). It may be preferable to use a selectable marker to
identify cells that contain the cloned DNA. Selectable markers are generally
introduced into the cells along with the cloned DNA molecules and include genes that
confer resistance to drugs, such as neomycin, hygromycin, and methotrexate.
Selectable markers may also complement auxotrophs in the host cell. Yet other
selectable markers, such as β-galactosidase, provide detectable signals to identify cells
containing the cloned DNA molecules. Selectable markers may be amplifiable. Such
amplifiable selectable markers may be used to amplify the number of sequences
integrated into the host genome.

As would be evident to one of ordinary skill in the art, the polynucleotide
molecules of the present invention may be expressed in Saccharomyces cerevisiae,
filamentous fungi, and E. coli. Methods for expressing cloned genes in
Saccharomyces cerevisiae are generally known in the art (see, "Gene Expression
Technology," Methods in Enzymology, Vol. 185, Goeddel (ed.), Academic Press, San
Diego, CA, 1990; and "Guide to Yeast Genetics and Molecular Biology," Methods in
Enzymology, Guthrie and Fink (eds.), Academic Press, San Diego, CA, 1991, which
are incorporated herein by reference). Filamentous fungi may also be used to express
the proteins of the present invention; for example, strains of the fungi Aspergillus
(McKnight et al., U.S. Patent No. 4,935,349, which is incorporated herein by
reference). Methods for expressing genes and cDNAs in cultured mammalian cells
and in E. coli are discussed in detail in Sambrook et al., 1989. As will be evident to
one skilled in the art, one can express the protein of the instant invention in other host
cells such as avian, insect, and plant cells using regulatory sequences, vectors and
methods well established in the literature.

NeuroD proteins produced according to the present invention may be purified
using a number of established methods such as affinity chromatography using
anti-neuroD antibodies coupled to a solid support. Fusion proteins of antigenic tag and
neuroD can be purified using antibodies to the tag. Additional purification may be
achieved using conventional purification means such as liquid chromatography,
gradient centrifugation, and gel electrophoresis, among others. Methods of protein
purification are known in the art (see generally, Scopes, R., Protein Purification,
Springer-Verlag, NY, 1982, which is incorporated herein by reference) and may be
applied to the purification of recombinant neuroD described herein.

The term "capable of hybridizing under stringent conditions" as used herein
means that the subject nucleic acid molecules (whether DNA or RNA) anneal under
stringent hybridization conditions to an oligonucleotide of 15 or more contiguous
nucleotides of SEQ ID NO:1, SEQ ID NO:3, SEQ ID NO:8, SEQ ID NO:10, SEQ
ID NO:12, SEQ ID NO:14 or SEQ ID NO:16. It is generally known that
oligonucleotides 15 nucleotides or more in length are extremely unlikely to be
represented more than once in a mammalian genome, hence such oligonucleotides can
form specific hybrids (see, for example, Sambrook et al., Molecular Cloning, [2d ed.],
Cold Spring Harbor Laboratory Press, 1989, at Section 11.7).

"Stringent hybridization" is generally understood in the art to mean that the
nucleic acid duplexes that form during the hybridization reaction are perfectly
matched or nearly perfectly matched. Several rules governing nucleic acid
hybridization have been well established. For example, it is standard practice to
achieve stringent hybridization for polynucleotide molecules >200 nucleotides in
length by hybridizing at a temperature 15°-25°C below the melting temperature (Tm)
of the expected duplex, and 5°-10°C below the Tm for oligonucleotide probes (e.g.,
Sambrook et al., at Section 11.45).
The Tm of a nucleic acid duplex can be calculated using a formula based on the %
G+C contained in the nucleic acids, and that takes chain length into account, such as
the formula Tm = 81.5 + 16.6 (log [Na+]) + 0.41 (% G+C) - (600/N), where
N = chain length (Sambrook et al., 1989, at Section 11.46). It is apparent from this
formula that the effect of chain length on Tm is significant only when rather short
nucleic acids are hybridized, and that the length effect is negligible for nucleic
acids longer than a few hundred bases.
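
The calculation described above can be sketched in a few lines of Python. This is an illustration only, not part of the disclosed methods, and the function and parameter names are hypothetical. It assumes that the logarithm is base 10, that [Na+] is expressed in mol/L, and that the salt term enters as +16.6 log10[Na+], the form in which this empirical relation is commonly stated, so that lowering the salt concentration lowers Tm. The second function applies the hybridization temperature windows quoted from Sambrook et al. above; the caller states whether the probe is an oligonucleotide, since the text gives no explicit length cut-off for that term.

```python
import math


def melting_temperature(percent_gc, length_nt, sodium_molar):
    """Estimate duplex Tm in degrees C from % G+C, chain length N and [Na+].

    Assumes the salt term is +16.6 * log10([Na+]), with [Na+] in mol/L.
    """
    if length_nt <= 0 or sodium_molar <= 0:
        raise ValueError("length and sodium concentration must be positive")
    return (81.5
            + 16.6 * math.log10(sodium_molar)
            + 0.41 * percent_gc
            - 600.0 / length_nt)


def hybridization_window(tm, is_oligonucleotide):
    """Return (low, high) hybridization temperatures in degrees C.

    Long probes (>200 nt): 15 to 25 degrees below Tm.
    Oligonucleotide probes: 5 to 10 degrees below Tm.
    """
    if is_oligonucleotide:
        return (tm - 10.0, tm - 5.0)
    return (tm - 25.0, tm - 15.0)


if __name__ == "__main__":
    # Hypothetical probes: 50% G+C at 1 M Na+, of two different lengths.
    for n in (20, 600):
        tm = melting_temperature(50.0, n, 1.0)
        print(n, round(tm, 1), hybridization_window(tm, n <= 200))
```

The 600/N term shows the length effect directly: it subtracts 30 degrees for a 20-mer but only 1 degree for a 600-nucleotide probe.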

The choice of hybridization conditions will be evident to one skilled in the art
and will generally be guided by the purpose of the hybridization, the type of
hybridization (DNA-DNA or DNA-RNA), and the level of desired relatedness
between the sequences. As discussed above, methods for hybridization are well
established in the literature. See also, for example: Sambrook et al., ibid.; Hames and
Higgins, eds., Nucleic Acid Hybridization, A Practical Approach, IRL Press,
Washington DC, 1985; Berger and Kimmel, eds., Methods in Enzymology, Vol. 52,
Guide to Molecular Cloning Techniques, Academic Press Inc., New York, NY, 1987;
and Bothwell, Yancopoulos and Alt, eds., Methods for Cloning and Analysis of
Eukaryotic Genes, Jones and Bartlett Publishers, Boston, MA 1990; which are
incorporated by reference herein in their entirety. One of ordinary skill in the art
realizes that the stability of nucleic acid duplexes will decrease with an increased
number and location of mismatched bases; thus, the stringency of hybridization may
be used to maximize or minimize the stability of such duplexes. Hybridization
stringency can be altered by: adjusting the temperature of hybridization; adjusting the
percentage of helix-destabilizing agents, such as formamide, in the hybridization mix;
and adjusting the temperature and/or salt concentration of the wash solutions. In
general, the stringency of hybridization is adjusted during the post-hybridization
washes by varying the salt concentration and/or the temperature. Stringency of
hybridization may be reduced by reducing the percentage of formamide in the
hybridization solution or by decreasing the temperature of the wash solution. High
stringency conditions may involve high temperature hybridization (e.g., 65-68°C in
aqueous solution containing 4-6 X SSC (1 X SSC = 0.15 M NaCl, 0.015 M sodium
citrate), or 42°C in 50% formamide) combined with washes at high temperature (e.g.,
5-25°C below the Tm), in a solution having a low salt concentration (e.g., 0.1 X SSC).
Low stringency conditions may involve lower hybridization temperatures (e.g.,
35-42°C in 20-50% formamide) with washes conducted at an intermediate temperature
(e.g., 40-60°C) and in a wash solution having a higher salt concentration (e.g., 2-6 X
SSC). Moderate stringency conditions, which may involve hybridization in
0.2-0.3 M NaCl at a temperature between 50°C and 65°C and washes in 0.1 X SSC,
0.1% SDS at between 50°C and 55°C, may be used in conjunction with the disclosed
polynucleotide molecules as probes to identify genomic or cDNA clones encoding
members of the neuroD family.
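
Because the conditions above mix SSC dilutions with molar NaCl concentrations, it can help to convert between the two. The following is a minimal sketch based on the definition given in the text (1 X SSC = 0.15 M NaCl, 0.015 M sodium citrate). It assumes the citrate is the trisodium salt, so that each mole contributes three moles of sodium; the text does not state this. The function names are illustrative only.

```python
NACL_PER_1X_SSC = 0.15      # mol/L NaCl in 1 X SSC, as defined in the text
CITRATE_PER_1X_SSC = 0.015  # mol/L sodium citrate in 1 X SSC, as defined
SODIUM_PER_CITRATE = 3      # assumption: trisodium citrate


def ssc_composition(fold):
    """Molar NaCl, citrate and total Na+ for an SSC solution of given fold."""
    if fold < 0:
        raise ValueError("SSC concentration cannot be negative")
    nacl = fold * NACL_PER_1X_SSC
    citrate = fold * CITRATE_PER_1X_SSC
    return {
        "NaCl_M": nacl,
        "citrate_M": citrate,
        "Na_total_M": nacl + SODIUM_PER_CITRATE * citrate,
    }


def ssc_fold_for_nacl(nacl_molar):
    """SSC dilution whose NaCl concentration equals nacl_molar."""
    return nacl_molar / NACL_PER_1X_SSC


if __name__ == "__main__":
    for fold in (0.1, 2.0, 6.0):
        print(fold, ssc_composition(fold))
```

Under these assumptions, the 0.2-0.3 M NaCl range given for moderate stringency hybridization corresponds to roughly 1.3-2 X SSC in NaCl terms, and a 0.1 X SSC wash contains about 0.02 M total sodium.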

The invention provides isolated and purified polynucleotide molecules
encoding neuroD proteins that are capable of hybridizing under stringent conditions to
an oligonucleotide of 15 or more contiguous nucleotides of SEQ ID NO:1, SEQ ID
NO:3, SEQ ID NO:8, SEQ ID NO:10, SEQ ID NO:12, SEQ ID NO:14, and/or SEQ
ID NO:16, and also including the polynucleotide molecules complementary to the
coding strands. The subject isolated neuroD polynucleotide molecules preferably
encode neuroD proteins that trigger differentiation in ectodermal cells, particularly
neuroectodermal stem cells, and in more committed cells of that lineage, for example,
epidermal precursor cells, pancreatic and gastrointestinal cells. Such neuroD
expression products typically form heterodimeric bHLH protein complexes that bind
in the 5'-regulatory regions of target genes and enhance or suppress transcription of
the target gene.

In some instances, cancer cells may contain a non-functional neuroD protein
or may contain no neuroD protein due to genetic mutation or somatic mutations such
that these cells fail to differentiate. For cancers of this type, the cancer cells may be
treated in a manner to cause the over-expression of wild-type neuroD protein to force
differentiation of the cancer cells.

Antisense neuroD nucleotide sequences, that is, nucleotide sequences
complementary to the non-transcribed strand of a neuroD gene, may be used to block
expression of mutant neuroD in neuronal precursor cells to generate and
harvest neuronal stem cells. The use of antisense oligonucleotides and their
applications have been reviewed in the literature (see, for example, Mol and Van der
Krul, eds., Antisense Nucleic Acids and Proteins: Fundamentals and Applications,
New York, NY, 1992; which is incorporated by reference herein in its entirety).
Suitable antisense oligonucleotides are at least 11 nucleotides in length and may
include untranslated (upstream or intron) and associated coding sequences. As will be
evident to one skilled in the art, the optimal length of an antisense oligonucleotide
depends on the strength of the interaction between the antisense oligonucleotide and
the complementary mRNA, the temperature and ionic environment in which
translation takes place, the base sequence of the antisense oligonucleotide, and the
presence of secondary and tertiary structure in the target mRNA and/or in the
antisense oligonucleotide. Suitable target sequences for antisense oligonucleotides
include intron-exon junctions (to prevent proper splicing), regions in which
DNA/RNA hybrids will prevent transport of mRNA from the nucleus to the
cytoplasm, initiation factor binding sites, ribosome binding sites, and sites that
interfere with ribosome progression. A particularly preferred target region for
antisense oligonucleotides is the 5' untranslated (promoter/enhancer) region of the gene
of interest. Antisense oligonucleotides may be prepared by the insertion of a DNA
molecule containing the target DNA sequence into a suitable expression vector such
that the DNA molecule is inserted downstream of a promoter in a reverse orientation
as compared to the gene itself. The expression vector may then be transduced,
transformed or transfected into a suitable cell resulting in the expression of antisense
oligonucleotides. Alternatively, antisense oligonucleotides may be synthesized using
standard manual or automated synthesis techniques. Synthesized oligonucleotides
may be introduced into suitable cells by a variety of means including electroporation,
calcium phosphate precipitation, liposomes, or microinjection. The selection of a
suitable antisense oligonucleotide administration method will be evident to one skilled
in the art. With respect to synthesized oligonucleotides, the stability of antisense
oligonucleotide-mRNA hybrids may be increased by the addition of stabilizing agents
to the oligonucleotide. Stabilizing agents include intercalating agents that are
covalently attached to either or both ends of the oligonucleotide. Oligonucleotides
may be made resistant to nucleases by, for example, modifications to the
phosphodiester backbone by the introduction of phosphotriesters, phosphonates,
phosphorothioates, phosphoroselenoates, phosphoramidates, or phosphorodithioates.
Oligonucleotides may also be made nuclease resistant by synthesis of the
oligonucleotides with alpha-anomers of the deoxyribonucleotides.
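
The relationship between a target mRNA region and a synthetic antisense oligonucleotide can be sketched as a reverse-complement operation. The Python example below is a minimal illustration under stated assumptions: it produces an unmodified DNA oligonucleotide, enforces only the 11-nucleotide lower bound given above, and ignores secondary structure, ionic conditions, and backbone modifications. The function name and the example target are hypothetical.

```python
# Illustrative sketch only. The function name and example target are
# hypothetical; no real neuroD sequence is used.
MIN_ANTISENSE_LENGTH = 11  # lower bound stated in the text

_RNA_TO_DNA_COMPLEMENT = {"A": "T", "C": "G", "G": "C", "U": "A"}


def antisense_dna_oligo(mrna_target: str) -> str:
    """Return the DNA oligonucleotide (written 5' to 3') that is
    complementary to the given mRNA target region (written 5' to 3')."""
    target = mrna_target.upper()
    if len(target) < MIN_ANTISENSE_LENGTH:
        raise ValueError(
            f"target must be at least {MIN_ANTISENSE_LENGTH} nucleotides long"
        )
    try:
        complement = [_RNA_TO_DNA_COMPLEMENT[base] for base in target]
    except KeyError as exc:
        raise ValueError(f"unexpected base in mRNA target: {exc.args[0]!r}") from None
    # Reverse so that the result reads 5' to 3'.
    return "".join(reversed(complement))


if __name__ == "__main__":
    print(antisense_dna_oligo("AUGGCCAAGUCA"))  # made-up 12-nucleotide target
```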

NeuroD proteins bind to 5' regulatory regions of neurogenic genes that are
involved in neuroectodermal differentiation, including development of neural and
endocrine tissues. As described in more detail herein, murine neuroD1 has been
detected in neuronal, pancreatic and gastrointestinal tissues in embryonic and adult
mice, suggesting that neuroD1 functions in transcriptional regulation in these tissues.
NeuroD proteins alter the expression of subject genes by, for example, down-regulating
or up-regulating transcription, or by inducing a change in transcription to
an alternative open reading frame. The subject polynucleotide molecules find a
variety of uses, e.g., in preparing oligonucleotide probes, expression vectors, and
transformed host cells, as disclosed in the Examples below.

DNA sequences recognized by the various neuroD proteins may be
determined using a number of methods known in the literature including
immunoprecipitation (Biedenkapp et al., Nature 335:835-837, 1988; Kinzler and
Vogelstein, Nuc. Acids Res. 17:3645-3653, 1989; and Sompayrac and Danna, Proc.
Natl. Acad. Sci. USA 87:3274-3278, 1990; which are incorporated by reference
herein), protein affinity columns (Oliphant et al., Mol. Cell. Biol. 9:2944-2949, 1989;
which is incorporated by reference herein), gel mobility shifts (Blackwell and
Weintraub, Science 250:1104-1110, 1990; which is incorporated by reference herein),
and Southwestern blots (Keller and Maniatis, Nuc. Acids Res. 17:4675-4680, 1991;
which is incorporated by reference herein).

One embodiment of the present invention involves the construction of
inter-species hybrid neuroD proteins and hybrid neuroD proteins containing at least one
domain from two or more neuroD family members to facilitate structure-function
analyses or to alter neuroD activity by increasing or decreasing the neuroD-mediated
transcriptional activation of neurogenic genes relative to the wild-type neuroD(s).
Hybrid proteins of the present invention may contain the replacement of one or more
contiguous amino acids of the native neuroD protein with the analogous amino acid(s)
of neuroD from another species or other protein of the neuroD family. Such
interspecies or interfamily hybrid proteins include hybrids having whole or partial
domain replacements. Such hybrid proteins are obtained using recombinant DNA
techniques. Briefly, DNA molecules encoding the hybrid neuroD proteins of interest
are prepared using generally available methods such as PCR mutagenesis,
site-directed mutagenesis, and/or restriction digestion and ligation. The hybrid DNA is
then inserted into expression vectors and introduced into suitable host cells. The
biological activity may be assessed essentially as described in the assays set forth in
more detail in the Examples that follow.

The invention also provides synthetic peptides, recombinantly derived
peptides, fusion proteins, and the like that include a portion of neuroD or the entire
protein. The subject peptides have an amino acid sequence encoded by a nucleic acid
which hybridizes under stringent conditions with an oligonucleotide of 15 or more
contiguous nucleotides of SEQ ID NO:1, SEQ ID NO:3, SEQ ID NO:8, SEQ ID
NO:10, SEQ ID NO:12, SEQ ID NO:14, or SEQ ID NO:16. Representative amino
acid sequences of the subject peptides are disclosed in SEQ ID NO:2, SEQ ID NO:4,
SEQ ID NO:9, SEQ ID NO:11, SEQ ID NO:13, SEQ ID NO:15, and SEQ ID
NO:17. The subject peptides find a variety of uses, including preparation of specific
antibodies and preparation of agonists and antagonists of neuroD activity.

As noted above, the invention provides antibodies that bind to neuroD
proteins. The production of non-human antisera or monoclonal antibodies (e.g.,
murine, lagomorph, porcine, equine) is well known and may be accomplished by, for
example, immunizing an animal with neuroD protein or peptides. For the production
of monoclonal antibodies, antibody producing cells are obtained from immunized
animals, immortalized and screened, or screened first for the production of the
antibody that binds to the neuroD protein or peptides and then immortalized. It may
be desirable to transfer the antigen binding regions (e.g., F(ab')2 or hypervariable
regions) of non-human antibodies into the framework of a human antibody by
recombinant DNA techniques to produce a substantially human molecule. Methods
for producing such "humanized" molecules are generally well known and described in,
for example, U.S. Patent No. 4,816,397; which is incorporated by reference herein in
its entirety. Alternatively, a human monoclonal antibody or portions thereof may be
identified by first screening a human B-cell cDNA library for DNA molecules that
encode antibodies that specifically bind to the neuroD family member, e.g., according
to the method generally set forth by Huse et al. (Science 246:1275-1281, 1989, which
is incorporated by reference herein in its entirety). The DNA molecule may then be
cloned and amplified to obtain sequences that encode the antibody (or binding
domain) of the desired specificity.

The invention also provides methods for inducing the expression of genes
associated with neuronal phenotype in a cell that does not normally express those
genes. Examples of neuronal phenotypes that may be modulated by neuroD
expression include expression of neurotransmitters or neuromodulatory factors. Cells
that can be used for the purpose of modulation of gene expression by neuroD include
cells of the neuroectodermal lineage, glial cells, neural crest cells, and epidermal
epithelial basal stem cells, and all types of both mesodermal and endodermal lineage
cells. NeuroD expression may also be used within methods that induce expression of
genes associated with pancreatic and gastrointestinal phenotype. Examples of such
gene expression include insulin expression and gastrointestinal-specific enzyme
expression.

As illustrated in Example 10, the expression of Xenopus neuroD1 protein in
stem cells causes redirection of epidermal cell differentiation and induces terminal
differentiation into neurons rather than epidermal cells. Epithelial basal stem cells
(i.e., in skin and mucosal tissues) are one of the few continuously regenerating cell
types in an adult mammal. Introduction of the subject nucleotide sequences into an
epithelial basal stem cell may be accomplished in vitro or in vivo using a suitable gene
therapy vector delivery system (e.g., a retroviral vector), a microinjection technique
(see, for example, Tam, Basic Life Sciences 37:187-194, 1986, which is incorporated
by reference herein in its entirety), or a transfection method (e.g., naked or liposome
encapsulated DNA or RNA; see, for example, Trends in Genetics 5:138, 1989; Chen
and Okayama, Biotechniques 6:632-638, 1988; Mannino and Gould-Fogerite,
Biotechniques 6:682-690, 1988; Kojima et al., Biochem. Biophys. Res. Comm.
207:8-12, 1995; which are incorporated by reference herein in their entirety). The
introduction method may be chosen to achieve a transient expression of neuroD in the
host cell, or it may be preferable to achieve constitutive or regulated expression in a
tissue specific manner.

Transformed host cells of the present invention find a variety of in vitro uses,
for example: i) as convenient sources of neuronal and other growth factors, ii) in
transient and continuous cultures for screening anti-cancer drugs capable of driving
terminal differentiation in neural tumors, iii) as sources of recombinantly expressed
neuroD protein for use as an antigen in preparing monoclonal and polyclonal
antibodies useful in diagnostic assays, and iv) in transient and continuous cultures for
screening for compounds capable of increasing or decreasing the activity of neuroD.

Transformed host cells of the present invention also find a variety of in vivo
uses, for example, for transplantation at sites of traumatic neural injury where motor
or sensory neural activity has been lost. Representative patient populations that may
benefit from transplantation include: patients with hearing or vision loss due to
optic or auditory nerve damage, patients with peripheral nerve damage and loss of
motor or sensory neural activity, and patients with brain or spinal cord damage from
traumatic injury. For example, donor cells from a patient such as epithelial basal stem
cells are cultured in vitro and then transformed or transduced with a neuroD
nucleotide sequence. The transformed cells are then returned to the patient by
microinjection at the site of neural dysfunction. In addition, as neuroD appears
capable of regulating expression of insulin, transformed host cells of the present
invention may be useful for transplantation into patients with diabetes. For example,
donor cells from a patient such as fibroblasts, pancreatic islet cells, or other pancreatic
cells are harvested and transformed or transfected with a neuroD nucleotide sequence.
The genetically engineered cells are then returned to the patient. In another
embodiment, such engineered host cells may find use in the treatment of
malabsorption syndromes.

Representative uses of the nucleotide sequences of the invention include the 
following: 

1. Construction of cDNA and oligonucleotide probes useful in Northern
or Southern blots, dot-blots, or PCR assays for identifying and quantifying the level of
expression of neuroD in a cell. High level expression of neuroD in neuroendocrine
tumors and in rapidly proliferating regions of embryonic neural development (see
below) indicates that measuring the level of neuroD expression may provide
prognostic markers for assessing the growth rate and invasiveness of a neural tumor.
In addition, considering the important role of neuroD in embryonic development, it is
thought highly likely that birth defects and spontaneous abortions may result from
expression of an abnormal neuroD protein. In this case, neuroD may prove highly
useful in prenatal screening of mothers and/or for in utero testing of fetuses.

2. Construction of recombinant cell lines, ova, and transgenic embryos
and animals including dominant-negative and "knock-out" recombinant cell lines in
which the transcription regulatory activity of neuroD protein is down-regulated or
eliminated. Such cells may contain altered neuroD coding sequences that result in the
expression of a neuroD protein that is not capable of enhancing, suppressing or
activating transcription of the target gene. The subject cell lines and animals find uses
in screening for candidate therapeutic agents capable of either substituting for a
function performed by neuroD or correcting the cellular defect caused by a defective
neuroD. Considering the important regulatory role of neuroD in embryonic
development, birth defects may occur from expression of mutant neuroD proteins,
and these defects may be correctable in utero or in early post-natal life through the
use of compounds identified in screening assays using neuroD. In addition, neuroD
polynucleotide molecules may be joined to reporter genes, such as β-galactosidase or
luciferase, and inserted into the genome of a suitable embryonic host cell such as a
mouse embryonic stem cell by, for example, homologous recombination (for review,
see Capecchi, Trends in Genetics 5:70-76, 1989; which is incorporated by reference).



WO 97/16548 PCT/US96/17532




Cells expressing neuroD may then be obtained by subjecting the differentiating
embryonic cells to cell sorting, leading to the purification of a population of
neuroblasts. Neuroblasts may be useful for studying neuroblast sensitivity to growth
factors or chemotherapeutic agents. The neuroblasts may also be used as a source
from which to purify specific protein products or gene transcripts. These products
may be used for the isolation of growth factors, or for the identification of cell surface
markers that can be used to purify stem cell populations from a donor for
transplantation.

As illustrated in Example 14, "knock-out" mice were generated by replacing
the murine neuroD1 coding region with the β-galactosidase reporter gene and the
neomycin resistance gene to assess the consequences of eliminating the murine
neuroD1 protein and to examine the tissue distribution of neuroD1 in fetal and
postnatal mice. Mice that were homozygous for the mutation (lacking neuroD1) had
diabetes, as demonstrated by high blood glucose levels, and died by day four.
Homozygous mutants had blood glucose levels between 2 and 3 times the blood
glucose level of wild-type mice. Heterozygous mutants exhibited blood
glucose levels similar to those of wild-type mice. Examination of stained tissue from fetal and
postnatal mice heterozygous for the mutation confirmed the neuroD1 expression
pattern in neuronal cells demonstrated by in situ hybridization (Example 4) and also
demonstrated neuroD expression in the pancreas and gastrointestinal tract.

"Knock-out" mice may be useful as a model system for diabetes. Such mice 
may be used to study methods to rescue homozygous mutants and as hosts to test 
transplant tissue for treating diabetes. 

3. Construction of gene transfer vectors (e.g., retroviral vectors, and the
like) wherein neuroD is inserted into the coding region of the vector under the control
of a promoter. NeuroD gene therapy may be used to correct traumatic neural injury
that has resulted in loss of motor or sensory neural function, and also for the
treatment of diabetes. For these therapies, gene transfer vectors may either be
injected directly at the site of the traumatic injury, or the vectors may be used to
construct transformed host cells that are then injected at the site of the traumatic
injury. The results disclosed in Example 10 indicate that introduction of neuroD1
induces a non-neuronal cell to become a neuron. This discovery raises for the first
time the possibility of using transplantation and/or gene therapy to repair neural
defects resulting from traumatic injury. In addition, the discovery of neuroD1
provides the possibility of providing specific gene therapy for the treatment of certain
neurological disorders such as Alzheimer's disease, Huntington's disease, and
Parkinson's disease, in which a population of neurons has been damaged. Two basic
methods of neuroD1 utilization are envisioned in this regard. In one method,
neuroD1 is expressed in existing populations of neurons to modulate aspects of their
neuronal phenotype (e.g., neurotransmitter expression or synapse targeting) to make
the neurons express a factor or phenotype to overcome the deficiency that contributes
to the disease. In this method, recombinant neuroD1 sequences are introduced into
existing neurons or endogenous neuroD1 expression is induced. In another method,
neuroD1 is expressed in non-neuronal cells (e.g., glial cells in the brain or another
non-neuronal cell type such as basal epithelial cells) to induce expression of genes that
confer a complete or partial neuronal phenotype that ameliorates aspects of the
disease. As an example, Parkinson's disease is caused, at least in part, by the death of
neurons that supply the neurotransmitter dopamine to the basal ganglia. Increasing
the levels of neurotransmitter ameliorates the symptoms of Parkinson's disease.
Expression of neuroD1 in basal ganglia neurons or glial cells may induce aspects of a
neuronal phenotype such that the neurotransmitter dopamine is produced directly in
these cells. It may also be possible to express neuroD1 in donor cells for
transplantation into the affected region, either as syngeneic or allogeneic
transplantations. Within yet another embodiment, neuroD1 is expressed in
non-pancreatic cells to induce expression of genes that confer a complete or partial
pancreatic phenotype that ameliorates aspects of diabetes. Within yet another
embodiment, neuroD1 is expressed in pancreatic islet cells to induce expression of
genes that induce the expression of insulin.

4. Preparation of transplantable recombinant neuronal precursor cell
populations from embryonic ectodermal cells, non-neural basal stem cells, and the
like. Establishing cultures of non-malignant neuronal cells for use in therapeutic
screening assays has proven to be a difficult task. The isolated polynucleotide
molecules encoding neuroD proteins of the present invention permit the establishment
of primary (or continuous) cultures of proliferating embryonic neuronal stem cells
under conditions mimicking those that are active in development and cancer. The
resultant cell lines find uses: i) as sources of novel neural growth factors, ii) in
screening assays for anti-cancer compounds, and iii) in assays for identifying novel
neuronal growth factors. For example, a high level of expression of neuroD was
observed in the embryonic optic tectum, indicating that neuroD1 protein may regulate
expression of factors trophic for growing retinal cells. Such cells may be useful
sources of growth factors, and may be useful in screening assays for candidate
therapeutic compounds.

The cell lines and transcription regulatory factors disclosed herein offer the
unique advantage that, since they are active very early in embryonic differentiation,
they represent potential switches, e.g., ON→OFF or OFF→ON, controlling
subsequent cell fate. If the switch can be shown to be reversible (i.e., ON↔OFF), the
neuroD transcription regulatory factor and neuroD nucleic acids disclosed herein
provide exciting opportunities for restoring lost neural and/or endocrine functions in a
subject.

The following examples are offered by way of illustration and not by way of
limitation.

EXAMPLE 1 

Construction of the embryonic stem cell "179" cDNA library.
A continuous murine embryonic stem cell line (i.e., the ES cell line) having
mutant E2A (the putative binding partner of myoD) was used as a cell source to
develop a panel of embryonic stem cell tumors. Recombinant ES stem cells were
constructed (i.e., using homologous recombination) wherein both alleles of the
putative myoD binding partner E2A were replaced with drug-selectable marker genes.
ES cells do not make functional E12 or E47 proteins, both of which are E2A gene
products. ES cells form subcutaneous tumors in congenic mice (i.e., strain 129J) that
appear to contain representatives of many different embryonal cell types as judged
histologically and through the use of RT-PCR gene expression assays. Individual
embryonic stem cell tumors were induced in male 129J strain mice by subcutaneous
injection of 1 x 10^ cells/site. Three weeks later each tumor was harvested and used
to prepare an individual sample of RNAs. Following random priming and second
strand synthesis the ds-cDNAs were selected based on their size on 0.7% agarose gels
and those cDNAs in the range of 400-800 bp were ligated to either Bam HI or Bgl II
linkers. (Linkers were used to minimize the possibility that an internal Bam HI site in
a cDNA might inadvertently be cut during cloning, leading to an abnormally sized or
out-of-frame expression product.) The resultant individual stem cell tumor DNAs
were individually ligated into the Bam HI cloning site in the "fl-VP16" 2µ yeast
expression vector. This expression vector, fl-VP16, contains the VP16 activation
domain of Herpes simplex virus (HSV) located between Hind III (HIII) and Eco RI
(RI) sites and under the control of the Saccharomyces cerevisiae alcohol
dehydrogenase promoter, with LEU2 and ampicillin-resistance selectable markers.
Insertion of a DNA molecule of interest into the Hind III site of the fl-VP16 vector
(i.e., 5' to the VP16 nucleotide sequence), or into a Bam HI site (i.e., 3' to the VP16
sequence but 5' to the Eco RI site), results in expression of a VP16 fusion protein
having the protein of interest joined in-frame with VP16. The resultant cDNA library
was termed the "179-library".

EXAMPLE 2 
Identification and cDNA cloning of mouse neuroD1.
A two-hybrid yeast screening assay, essentially as described by Fields
and Song (Nature 340:245, 1989) and modified as described herein, was used to
screen the 179-library described in Example 1. Yeast two-hybrid screens are
reviewed in Fields and Sternglanz (Trends in Genetics 10:286-292,
1994). The library was screened for cDNAs that interacted with LexA-Da, a fusion
protein between the Drosophila Da (Daughterless) bHLH domain and the prokaryotic
LexA DNA binding domain. The S. cerevisiae strain L40 contained multimerized
LexA binding sites cloned upstream of two reporter genes, namely, the HIS3 gene
and the β-galactosidase gene, each of which was integrated into the L40 genome.
The S. cerevisiae strain L40 containing a plasmid encoding the LexA-Da fusion
protein was transformed with CsCl gradient-purified fl-VP16-179-cDNA library.
Transformants were maintained on medium selecting both plasmids (the LexA-Da
plasmid and the cDNA library plasmid) for 16 hours before being subjected to
histidine selection on plates lacking histidine, leucine, tryptophan, uracil, and lysine.
Clones that were HIS+ were subsequently assayed for the expression of LacZ. To
eliminate possible non-specific cloning artifacts, plasmids from HIS+/LacZ+ clones were
isolated and transformed into S. cerevisiae strain L40 containing a plasmid encoding
a LexA-Lamin fusion. Clones that scored positive in the interaction with lamin were
discarded. Approximately 400 cDNA clones, which represented 60 different
transcripts, were identified as positive in these assays. Twenty-five percent of the
original clones were subsequently shown to be known bHLH genes on the basis of
their reactivity with specific cDNA probes. One cDNA clone encoding a VP16-fusion
protein that interacted with Da but not lamin was identified as unique by sequence
analysis. This clone, initially termed tango, is now referred to as neuroD1.
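
The selection logic of the screen described above (HIS3 reporter positive, then LacZ positive, then negative against the LexA-Lamin control) can be summarized as a simple filter. The Python sketch below is only a bookkeeping illustration under stated assumptions: the `Clone` record, its field names, and the example clone names are hypothetical and do not correspond to any data in this disclosure.

```python
# Illustrative sketch only. The Clone record, its field names, and the
# example clones are hypothetical; no actual screening data are shown.
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class Clone:
    name: str
    his_positive: bool     # grew under histidine selection with the LexA-Da bait
    lacz_positive: bool    # expressed the beta-galactosidase reporter
    lamin_positive: bool   # also scored positive with the LexA-Lamin control


def passes_screen(clone: Clone) -> bool:
    """Keep clones that activate both reporters with the Da bait
    and do not interact with the lamin control."""
    return clone.his_positive and clone.lacz_positive and not clone.lamin_positive


def select_candidates(clones: Iterable[Clone]) -> List[str]:
    """Return the names of clones retained after all three steps."""
    return [clone.name for clone in clones if passes_screen(clone)]


if __name__ == "__main__":
    example = [
        Clone("clone_a", True, True, False),   # retained
        Clone("clone_b", True, False, False),  # fails LacZ assay
        Clone("clone_c", True, True, True),    # discarded: binds lamin
    ]
    print(select_candidates(example))
```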

The unique cDNA identified above, VP16-neuroD, contained an
approximately 450 bp insert that spanned the bHLH region. Sequence analysis
showed that the clone contained an insert encoding a complete bHLH amino acid
sequence motif that was unique and previously unreported. Further analysis
suggested that while the cDNA contained conserved residues common to all members
of the bHLH protein family, several residues were unique and made it distinct from
previously identified bHLH proteins. The DNA cloned in VP16-neuroD is referred to
as "neuroD1." The neuroD1 cDNA insert was subcloned as a Bam HI-Not I insert
into Bam HI-Not I linearized pBluescript SK+. The resulting plasmid was designated
pSK+1-83.

The neuroD1 insert contained in the VP16-neuroD plasmid was used to
re-probe a mouse cDNA library prepared from mouse embryos at developmental stage
e10.5. Candidate clones were isolated and sequenced essentially as described above.
Several clones were isolated. One clone, designated pKS+ m7a RX, was deposited at
the American Type Culture Collection, 12301 Parklawn Drive, Rockville, MD 20852
USA, on May 6, 1994, under accession number 75768. Plasmid pKS+ m7a RX
contains 1646 bp of murine neuroD1 cDNA as an EcoRI-XhoI insert. The amino acid
sequence encoded by the insert begins at amino acid residue +73 and extends to the
carboxy-terminus of the neuroD1 protein. The plasmid contains about 855 bp of
neuroD1 coding sequence (encoding amino acids 73-536).

None of the mouse cDNAs contained the complete 5' coding sequence. To
obtain the 5' neuroD1 coding sequence, a mouse strain 129/Sv genomic DNA library
was screened with the VP16-neuroD plasmid insert (450 bp). Genomic clones were
isolated and sequenced and the sequences were aligned with the cDNA sequences.
Alignment of the sequence and comparison of the genomic 5' coding sequences with
the Xenopus neuroD1 clone (Example 8) confirmed the 5' neuroD1 coding sequence.
The complete neuroD1 coding sequence and deduced amino acid sequence are shown
in SEQ ID NOS:1 and 2.

EXAMPLE 3 
NeuroD/neuroD 

bHLH proteins share common structural similarities that include a basic region
that binds DNA and an HLH region involved in protein-protein interactions required
for the formation of homodimers and heterodimeric complexes. A comparison of the
amino acid sequence of the basic region of murine neuroD1 (amino acids 102 to 113
of SEQ ID NO:2) with basic regions of other bHLH proteins revealed that murine
neuroD contained all of the conserved residues characteristic among this family of
proteins. However, in addition, neuroD1 contained several unique residues. These
unique amino acid residues were not found in any other known HLH, making
neuroD1 a distinctive new member of the bHLH family. The NARERNR basic region
motif in neuroD (amino acids 107-113 of SEQ ID NO:2) is also found in the
Drosophila AS-C protein, a protein thought to be involved in neurogenesis. Similar,
but not identical, NARERRR and NERERNR motifs (SEQ ID NOS:5 and 6,
respectively) have been found in the Drosophila Atonal and MASH (mammalian
achaete-scute homolog) proteins, respectively, which are also thought to be involved
in neurogenesis. The NARER motif (SEQ ID NO:7) of neuroD1 is shared by other
bHLH proteins, and the Drosophila Daughterless (Da) and mammalian E proteins.
The basic region of bHLH proteins is important for DNA binding site recognition, and
there is homology between neuroD1 and other neuro-proteins in this functional
region. Within the important dimer-determining HLH region of neuroD1, a low level
of homology was recorded with mouse twist protein (i.e., 51% homology) and with
MASH (i.e., 46% homology). NeuroD1 contains several regions of unique peptide
sequence within the bHLH domain including the junction sequence (MHG).
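
The motif comparison above can be made concrete by counting mismatched positions between equal-length peptide motifs. The Python sketch below uses only the three seven-residue motifs quoted in the text (NARERNR, NARERRR, NERERNR); the function names are hypothetical, and the simple position-by-position identity shown here is an assumption that is not necessarily the method used to obtain the 51% and 46% homology figures reported for the HLH region.

```python
# Illustrative sketch only. Function names are hypothetical; the motifs are
# those quoted in the text. Ungapped, position-by-position comparison is an
# assumption and may differ from how the reported homology values were derived.
from typing import List


def mismatch_positions(a: str, b: str) -> List[int]:
    """Return the 1-based positions at which two equal-length motifs differ."""
    if len(a) != len(b):
        raise ValueError("motifs must have the same length")
    return [i + 1 for i, (x, y) in enumerate(zip(a, b)) if x != y]


def percent_identity(a: str, b: str) -> float:
    """Return the percentage of positions with identical residues."""
    if not a:
        raise ValueError("motifs must not be empty")
    mismatches = len(mismatch_positions(a, b))
    return 100.0 * (len(a) - mismatches) / len(a)


if __name__ == "__main__":
    neurod_motif = "NARERNR"
    for other in ("NARERRR", "NERERNR"):
        print(other, mismatch_positions(neurod_motif, other),
              round(percent_identity(neurod_motif, other), 1))
```

Each of the two related motifs differs from NARERNR at a single position, which is consistent with their description in the text as similar but not identical.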

EXAMPLE 4 

Tissue expression patterns of neuroD1, neuroD2, and neuroD3
NeuroD1 expression was analyzed during embryonic development of mouse
embryos using in situ hybridization. The probe used was an antisense neuroD1
single-stranded riboprobe labeled with digoxigenin (Boehringer Mannheim). Briefly,
a riboprobe was prepared from plasmid pSK+1-83 using T7 polymerase and
digoxigenin-11-UTP for labeling. The hybridized probe was detected using
anti-digoxigenin antibody conjugated with alkaline phosphatase. Color development was
carried out according to the manufacturer's instructions. Stages of development are
commonly expressed as days following copulation, where formation of the vaginal
plug is e0.5. The results recorded in the in situ hybridization studies were as follows:
In the e9.5 mouse embryo, neuroD1 expression was observed in the
developing trigeminal ganglia.

In the e10.5 mouse embryo, a distinctive pattern of neuroD1 expression was
observed in all the cranial ganglia (i.e., V-XI) and in dorsal root ganglia (DRG) in the
trunk region of the embryo. At this time, neuroD1 expression was also observed in
the central nervous system in post-mitotic cells in the brain and spinal cord that were
undergoing neuronal differentiation. In the spinal cord, the ventral portion of the cord
from which the motor neurons arise and differentiate was observed to express
neuroD1 at high levels; and expression in the posterior-ventral spinal cord was higher
when compared to more mature anterior-ventral spinal cord.
In the e11.5 mouse embryo, the ganglionic expression pattern of neuroD1
observed in e10.5 persisted. Expression in the spinal cord was increased over the
level of expression observed in e10.5 embryos, which was consistent with the
presence of more differentiating neurons at this stage. At this stage neuroD1
expression was also observed in other sensory organs in which neuronal
differentiation occurs, for example, in the nasal epithelium, otic vesicle, and retina of
the eye. In both of these organs neuroD1 expression was observed in the region
containing differentiating neurons.

In the e14.5 mouse embryo, expression of neuroD1 was observed in cranial
ganglia and DRG, but expression of neuroD1 persisted in the neuronal regions of
developing sensory organs and the central nervous system (CNS). Thus, neuroD1
expression was observed to be transient during neuronal development.

In summary, expression of neuroD1 in the neurula stage of the embryo (e10),
in the neurogenic derivatives of neural crest cells, the cranial and dorsal root ganglia,
and post-mitotic cells in the CNS suggests an important possible link between
expression and generation of sensory and motor nerves. Expression occurring later in
embryonic development in differentiating neurons in the CNS and in sensory organs
(i.e., nasal epithelium and retina) also supports a role in development of the CNS and
sensory nervous tissue. Since neuroD1 expression was transient, the results suggest
that neuroD1 expression is operative as a switch controlling formation of sensory
nervous tissue. It is noteworthy that in these studies neuroD1 expression was not
observed in embryonic sympathetic and enteric ganglia (also derived from migrating
neural crest cells). Overall, the results indicate that neuroD1 plays an important role
in neuronal differentiation.

In addition to the in situ studies described above, Northern blot analysis was
done to determine in what tissues of the mouse neuroD1, neuroD2, and neuroD3
were expressed. Total RNA was isolated from whole mouse embryos and adult
mouse tissues. RNA isolation was performed using RNAzol B according to the
protocol provided (Cinna/Biotex CS-105B). RNA was size fractionated on 1.5%
agarose gels and transferred to Hybond-N membranes. Hybridization was carried out
in 7% SDS, 0.25 M Na2PO4, 10 mg/ml BSA, 1 mM EDTA at 65°C for at least 5 hours,
and the membranes were then washed in 0.1X SSC and 0.1% SDS at 55°C-60°C.
Probes for analyzing mouse mRNA were prepared from fragments representing the
divergent carboxy-terminal regions 3-prime of the bHLH domain to avoid
cross-hybridization between genes. Probe for neuroD1 was made from a 350 base
pair PstI fragment from the mouse neuroD1 cDNA (Lee et al., 1995) that encompasses
the region coding for amino acids 187-304; probe for neuroD2 was made from a
635 base pair PstI fragment from the mouse neuroD2 cDNA that encompasses the
region from amino acid 210 through to the 3-prime non-translated region; and probe
for neuroD3 was made from a 400 base pair ApaI-BamHI fragment from the neuroD3
genomic region that is 3-prime to the region coding the bHLH domain.
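
As a consistency check on the neuroD1 probe described above, the stated coding span (amino acids 187-304) can be converted to a nucleotide length and compared with the stated fragment size of about 350 base pairs. The following is a minimal sketch only; the function name is hypothetical and is introduced here purely for illustration.

```python
def coding_span_bp(first_aa: int, last_aa: int) -> int:
    """Return the number of base pairs that encode residues first_aa..last_aa, inclusive.

    Illustrative helper (not part of the described methods): each amino acid
    is encoded by one codon of three nucleotides.
    """
    if first_aa < 1 or last_aa < first_aa:
        raise ValueError("expected 1 <= first_aa <= last_aa")
    codons = last_aa - first_aa + 1
    return codons * 3


# Residues 187 through 304 inclusive are 118 codons.
span = coding_span_bp(187, 304)
print(span)  # 354
```

The result, 354 base pairs, agrees with the approximate 350 base pair size given for the PstI fragment.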

After labeling with 32P, the above-described fragments were used to probe
Northern blots containing RNAs prepared from various tissues of newborn and adult
mice. Both neuroD1 and neuroD2 were detected in the brain of both newborn and
adult mice, whereas neuroD3 transcripts were not detected in any of the tissues
tested. RNA extracted from dissected regions of the adult mouse nervous system
demonstrated that neuroD1 was more abundant in the cerebellum than the cortex,
whereas neuroD2 was expressed at relatively equivalent levels in both cerebellum and
cortex.

To determine when during mouse embryonic development neuroD2 and
neuroD3 were expressed in comparison to neuroD1, RNA was prepared from whole
embryos at various developmental stages. In accord with previous reports (Lee et al.,
1995), neuroD1 mRNA was first detected at low levels at embryonic day 9.5 and at
increasing levels through embryonic day 12.5, the latest embryonic stage tested.
NeuroD2 mRNA was first detected at embryonic day 11 and also increased in
abundance through embryonic day 12.5. Although we did not detect neuroD3 in the
adult tissues, the embryonic expression pattern showed a transient expression between
embryonic day 10 and 12 that then declined to undetectable levels by embryonic day
16. Collectively, these data demonstrate that neuroD3 is expressed transiently during
embryogenesis, similar to the expression pattern of MATH1 (Akazawa et al., 1995),
and that the temporal expression of neuroD1 and neuroD2 partly overlaps with that of
neuroD3, but that their expression persists in the adult nervous system.

EXAMPLE 5 

NeuroD1 is expressed in neural and brain tumor cells: murine probes identify
human neuroD1.

Given the expression pattern in mouse embryo (Example 4), Northern blots of
tumor cell line mRNAs were examined using murine neuroD1 cDNA (Example 2) as
a molecular probe. As a first step, cell lines that have the potential for developing into
neurons were screened. The D283 human medulloblastoma cell line, which expresses
many neuronal markers, expressed high levels of neuroD1 by Northern blot analysis.
NeuroD1 was also transcribed at various levels by different human neuroblastoma cell
lines and in certain rhabdomyosarcoma lines that are capable of converting to neurons.

EXAMPLE 6 
Recombinant cells expressing NeuroD1.
Recombinant murine 3T3 fibroblast cells expressing either a myc-tagged
murine neuroD1 protein or myc-tagged Xenopus neuroD1 protein were made. The
recombinant cells were used as a test system for identifying antibody to neuroD
described below.

Xenopus neuroD1 protein was tagged with the antigenic marker Myc to allow
the specificity of anti-neuroD1 antibodies to be determined.
Plasmid CS2+MT was used to produce the Myc fusion protein. The CS2+MT vector
(Turner and Weintraub, ibid.) contains the simian cytomegalovirus IE94
enhancer/promoter (and an SP6 promoter in the 5' untranslated region of the
IE94-driven transcript to allow in vitro RNA synthesis) operatively linked to a DNA
sequence encoding six copies of the Myc epitope tag (Roth et al., J. Cell Biol.
115:587-596, 1991, which is incorporated herein in its entirety), a polylinker for
insertion of coding sequences, and an SV40 late polyadenylation site. CS2+MT was
digested with Xho I to linearize the plasmid at the polylinker site downstream of the
DNA sequence encoding the Myc tag. The linearized plasmid was blunt-ended using
Klenow and dNTPs. A full length Xenopus neuroD1 cDNA clone was digested with
Xho I and Eae I and blunt-ended using Klenow and dNTPs, and the 1.245 kb
fragment of the Xenopus neuroD1 cDNA was isolated. The neuroD1 fragment and
the linearized vector were ligated to form plasmid CS2+MT x1-83.

CS2+MT was digested with Eco RI to linearize the plasmid at the polylinker
site downstream of the DNA sequence encoding the Myc tag. The linearized plasmid
was blunt-ended using Klenow and dNTPs and digested with Xho I to obtain a
linearized plasmid having an Xho I adhesive end and a blunt end. Plasmid pKS+m7a
containing a partial murine neuroD1 cDNA was digested with Xho I, and the
neuroD1-containing fragment was blunt-ended and digested with Xba I to obtain the
approximately 1.6 kb fragment of the murine neuroD1 cDNA. The neuroD1
fragment and the linearized vector were ligated to form plasmid CS2+MT
M1-83(m7a).

Plasmids CS2+MT x1-83 and CS2+MT M1-83(m7a) were each transformed
into murine 3T3 fibroblast cells and used as a test system for identifying antibody
against neuroD1 (Example 7).

EXAMPLE 7
Antibodies to NeuroD1.
A recombinant fusion protein of maltose binding protein (MBP) and amino
acid residues 70-355 of murine neuroD1 was used as an antigen to evoke antibodies
in rabbits. Specificity of the resultant antisera was confirmed by immunostaining of
the recombinant 3T3 cells described above. Double-immunostaining of the
recombinant cells was observed with monoclonal antibodies to Myc (i.e., the control
antigenic tag on the transfected DNA) and with rabbit anti-murine neuroD1 in
combination with anti-rabbit IgG. The specificity of the resultant anti-murine
neuroD1 sera was investigated further by preparing mouse 3T3 fibroblast cells
transfected with different portions of neuroD1 DNA. Specificity seemed to map to
the glutamic acid-rich domain (i.e., amino acids 66-73 of SEQ ID NO:2). The
anti-murine antisera did not react with cells transfected with the myc-tagged Xenopus
neuroD1. In a similar manner, Xenopus neuroD1 was used to generate rabbit
anti-neuroD antisera. The antisera were Xenopus-specific and did not cross react with
cells transfected with Myc-tagged murine neuroD1.

EXAMPLE 8

NeuroD1 is a highly evolutionarily conserved protein: sequence of Xenopus neuroD1.
Approximately one million clones from a stage 17 Xenopus head cDNA library
made by Kintner and Melton (Development 99:311, 1987) were screened with the
mouse cDNA insert as a probe at low stringency. The hybridization was performed
with 50% formamide/4X SSC at 33°C and washed with 2X SSC/0.1% SDS at 40°C.

Positive clones were identified and sequenced. Analysis of the Xenopus
neuroD1 cDNA sequence (SEQ ID NO:3) revealed that neuroD1 is a highly
conserved protein between frog and mouse. The deduced amino acid sequences of
frog and mouse (SEQ ID NOS:2 and 4) show 96% identity in the bHLH domain (50
of 52 amino acids are identical) and 80% identity in the region that is
carboxy-terminal to the bHLH domain (159 of 198 amino acids are identical). The domain
structures of murine and Xenopus neuroD1 are highly homologous with an "acidic"
N-terminal domain (i.e., glutamic or aspartic acid rich); a basic region; helix 1, loop,
helix 2; and a proline rich C-terminal region. Although the amino terminal regions of
murine and Xenopus neuroD1 differ in amino acid sequence, both retain a glutamic or
aspartic acid rich "acidic domain" (amino acids 102 to 113 of SEQ ID NO:2 and
amino acids 56 to 79 of SEQ ID NO:4). It is highly likely that the acidic domain
constitutes an "activation" domain for the neuroD1 protein, in a manner analogous to
the activation mechanisms currently understood for other known transcription
regulatory factors.
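
The identity percentages quoted in this Example follow directly from the residue counts given in the text. The short sketch below reproduces that arithmetic; the function name is hypothetical, and the calculation assumes only that percent identity is the number of identical residues divided by the number of compared positions.

```python
def percent_identity(identical: int, compared: int) -> float:
    """Percent identity as identical residues over compared positions.

    Illustrative helper; the counts passed in below are those stated in the text.
    """
    if compared <= 0 or not 0 <= identical <= compared:
        raise ValueError("expected 0 <= identical <= compared and compared > 0")
    return 100.0 * identical / compared


bhlh = percent_identity(50, 52)        # bHLH domain
c_term = percent_identity(159, 198)    # region carboxy-terminal to the bHLH domain
print(round(bhlh), round(c_term))      # 96 80
```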

EXAMPLE 9 
Neuronal expression of Xenopus neuroD1.
The expression pattern of neuroD1 in whole mount Xenopus embryos was
determined using in situ hybridization with a single stranded digoxigenin-labeled
Xenopus neuroD1 antisense cDNA riboprobe. Embryos were examined at several
different stages.

Consistent with the mouse expression pattern, by late stage, all cranial ganglia
showed very strong staining patterns. In Xenopus, as in other vertebrate organisms,
neural crest cells give rise to skeletal components of the head, all ganglia of the
peripheral nervous system, and pigment cells. Among these derivatives, the cranial
sensory ganglia, which are of mixed crest and placode origin, represent the only group
of cells that express neuroD1. High levels of neuroD1 expression in the eye were also
observed, correlating with active neuronal differentiation in the retina at this stage.
Expression was observed in the developing olfactory placodes and otic vesicles, as was
seen in mice. The pineal gland also expressed neuroD1. All of this expression was
transient, suggesting that neuroD1 functions during the differentiation process but is
not required for maintenance of these differentiated cell types.

As early as stage 14 (i.e., the mid-neurula stage) neuroD1 expression was
observed in the cranial neural crest region where trigeminal ganglia differentiate.
Primary mechanosensory neurons in the spinal cord, also referred to as Rohon-Beard
cells, and primary motor neurons showed neuroD1 expression at this stage.

By stage 24, all of the developing cranial ganglia, trigeminal, facio-acoustic,
glosso-pharyngeal, and vagal nervous tissues showed a high level of neuroD1
expression. High levels of expression of neuroD1 were also observed in the eye at
this stage. (Note that in Xenopus neuronal differentiation in the retina occurs at a
much earlier stage than in mice, and neuroD1 expression was correspondingly earlier
and stronger in this animal model.)

In summary, in Xenopus as in mouse, neuroD1 expression was correlated with
sites of neuronal differentiation. The remarkable evolutionary conservation of the
pattern of neuroD1 expression in differentiating neurons supports the notion that
neuroD1 has been evolutionarily conserved both structurally and functionally in these
distant classes, which underscores the critical role performed by this protein in
embryonic development.

EXAMPLE 10

Expression of neuroD1 and neuroD2 converts non-neuronal cells into neurons.
To further analyze the biological functions of neuroD1, a gain-of-function
assay was conducted. In this assay, RNA was microinjected into one of the two cells
in a 2-cell stage Xenopus embryo, and the effects on later development of neuronal
phenotype were evaluated. For these experiments myc-tagged Xenopus neuroD1
transcripts were synthesized in vitro using SP6 RNA polymerase. The myc-tagged
neuroD1 transcripts were microinjected into one of the two cells in a Xenopus 2-cell
embryo, and the other cell of the embryo served as an internal control.

Synthesis of capped RNA for the Xenopus laevis injections was done
essentially as described (Krieg, P. A. and D. A. Melton, Meth. Enzymol.
155:397-415, 1987) using SP6 transcription of pCS2-hND2, pCS2-hND1,
pCS2-mND2, and pCS2MT-mND2. The capped RNA was phenol/chloroform
extracted followed by separation of unincorporated nucleotides using a G-50 spin
column. Approximately 350 pg of capped RNA was injected into one cell of a 2-cell
stage albino Xenopus laevis embryo in a volume of approximately 5 nl, as described
previously (Turner and Weintraub, 1994). Embryos were allowed to develop in
0.1X modified Barth's saline (MBS) and staged according to Nieuwkoop and Faber
(Nieuwkoop, P.D. and J. Faber, "Normal Table of Xenopus laevis," North-Holland
Publishing Co., Amsterdam, The Netherlands, 1967). Embryos were fixed in
MEMFA for 2 hours at room temperature and stored in methanol. Embryos were
hydrated through a graded series of methanol/PBS solutions and prepared for
immunohistochemistry as described (Turner and Weintraub, 1994). The embryos
were stained with an anti-NCAM antibody (Balak et al., Develop. Biol. 119:540-550,
1987) diluted 1:500 (gift of Urs Rutishauser) followed by a goat anti-rabbit alkaline
phosphatase conjugated secondary antibody, or stained with the monoclonal anti-myc
tag 9E10 antibody. Presence of the antibody was visualized by NBT/BCIP color
reaction according to the protocol provided (Gibco).
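
The injection parameters stated above (approximately 350 pg of capped RNA in approximately 5 nl) imply a working RNA concentration, which can be derived as in the following sketch. The function name is hypothetical; the only fact relied on is the unit identity that 1 pg/nl equals 1 ng/µl.

```python
def rna_concentration_ng_per_ul(mass_pg: float, volume_nl: float) -> float:
    """Concentration of the injected RNA solution in ng/µl.

    Illustrative helper: pg/nl and ng/µl are numerically equal because both
    the mass and the volume units scale by the same factor of 1000.
    """
    if mass_pg < 0 or volume_nl <= 0:
        raise ValueError("mass must be non-negative and volume must be positive")
    return mass_pg / volume_nl


# Approximate values stated in the text: 350 pg delivered in 5 nl.
print(rna_concentration_ng_per_ul(350, 5))  # 70.0
```

On these approximate figures the injected solution would be about 70 ng/µl.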

Antibodies to Xenopus N-CAM, a neural adhesion molecule, and to Myc (to
detect the exogenous protein tag) were used with immunostaining techniques to
evaluate phenotypic expression of the neuronal marker (and control) gene during the
subsequent developmental stages of the microinjected embryos. Remarkably, an
evaluation of over 130 embryos that were injected with neuroD1 RNA showed a
striking increase in ectopic expression of N-CAM on the microinjected side of the
embryo (i.e., Myc+), as judged by increased immunostaining. The increased staining
was observed in the region from which neural crest cells normally migrate. It is
considered likely that ectopic expression (or over-expression) of neuroD1 caused
neural crest stem cells to follow a neurogenic cell fate. Outside the neural tube, the
ectopic immunostaining was observed in the facio-cranial region and epidermal layer,
and in some cases the stained cells were in the ventral region of the embryo far from
the neural tube. The immunostained cells not only expressed N-CAM ectopically, but
displayed a morphological phenotype of neuronal cells. At high magnification, the
N-CAM expressing cells exhibited typical neuronal processes reminiscent of axonal
processes.

To confirm that the ectopic N-CAM expression resulted from a direct effect
on the presumptive epidermal cells and not from aberrant neural cell migration into
the lateral and ventral epidermis, neuroD1 RNA was injected into the top tier of
32-cell stage embryos, in order to target the injection into cells destined to become
epidermis. N-CAM staining was observed in the lateral and ventral epidermis without
any noticeable effect on the endogenous nervous system, indicating that the staining
of N-CAM in the epidermis represents the conversion of epidermal cell fate into
neuronal cell fate.

Ectopic generation of neurons by neuroD1 was confirmed with other neural
specific markers, such as neural-specific class II β-tubulin (Richter et al., Proc. Natl.
Acad. Sci. USA 85:8066, 1988), acetylated α-tubulin (Piperno and Fuller, J. Cell
Biol. 101:2085, 1985), tanabin (Hemmati-Brivanlou et al., Neuron 9:417, 1992),
neurofilament (NF)-M (Szaro et al., J. Comp. Neurol. 273:344, 1988), and Xen-1,2
(Ruiz i Altaba, Development 115:67, 1992). The embryos were subjected to
immunochemistry as described by Turner and Weintraub (Genes Dev. 8:1434, 1994,
which is incorporated by reference herein) using primary antibodies detected with
alkaline phosphatase-conjugated goat anti-mouse or anti-rabbit antibodies diluted to
1:2000 (Boehringer-Mannheim). Anti-acetylated alpha-tubulin was diluted 1:2000.
Anti-Xen-1 was diluted 1:1. Anti-NF-M was diluted 1:2000. Embryos stained for
NF-M were fixed in Dent's fixative (20% dimethylsulfoxide/80% methanol) and
cleared in 2:1 benzyl benzoate/benzyl alcohol as described by Dent et al.
(Development 105:61, 1989, which is incorporated by reference herein). In situ
hybridization of embryos was carried out essentially as described by Harland (in
Methods in Cell Biology, B.K. Kay, H.J. Peng, eds., Academic Press, New York,
NY, Vol. 36, pp. 675-685, 1991, which is incorporated by reference herein) as
modified by Turner and Weintraub (ibid.). In situ hybridization with β-tubulin
without RNase treatment can also detect tubulin expression in the ciliated epidermal
cells. All of these markers displayed ectopic staining on the neuroD1 RNA injected
side. Injection of neuroD1 mRNA into vegetal cells led to no ectopic expression of
neural markers except in one embryo that showed internal N-CAM staining in the
trunk region, suggesting the absence of cofactors or the presence of inhibitors in
vegetal cells. However, the one embryo that showed ectopic neurons in the internal
organ tissue suggests that it may be possible to convert non-ectodermal lineage cells
into neurons under certain conditions.

The embryos were also stained with markers that detect Rohon-Beard cells
(cells in which neuroD1 is normally expressed). Immunostaining using the method
described above for Rohon-Beard cell-specific markers such as HNK-1 (Nordlander,
Dev. Brain Res. 50:147, 1989, which is incorporated by reference herein) at a dilution
of 1:1, Islet-1 (Ericson et al., Science 256:1555, 1992 and Korzh et al., Development
118:417, 1993) at a dilution of 1:500, and in situ hybridization as described above
with shaker-1 (Ribera et al., J. Neurosci. 13:4988, 1993) showed more cells staining
on the injected side of the embryos.

The combined results support the notion that ectopic expression of neuroD1
induced differentiation of neuronal cells from cells that, without neuroD1
microinjection, would have given rise to non-neuronal cells. In summary, these
experiments support the notion that ectopic neuroD1 expression can be used to
convert a non-neuronal cell (i.e., uncommitted neural crest cells and epidermal
epithelial basal stem cells) into a neuron. These findings offer for the first time the
potential for gene therapy to induce neuron formation in injured neural tissues.

Interesting morphological abnormalities were observed in the microinjected
embryos. In many cases the eye on the microinjected side of the embryo failed to
develop. In other embryos, the spinal cord on the microinjected side of the embryo
failed to develop properly, and the tissues were strongly immunopositive when stained
with anti-N-CAM. In addition, at the mid-neurula stage many microinjected embryos
exhibited an increase in cell mass in the cranial region of the embryo from which (in a
normal embryo) the neural crest cells and their derivatives (i.e., cranial ganglionic
cells) would migrate. The observed cranial bulge exhibited strong immunostaining
with antibodies specific for N-CAM. These results were interpreted to mean that
morphological changes in the eye, neural crest, and spinal cord resulted from
premature neural differentiation which altered the migration of neural and neural crest
precursor cells.

NeuroD1-injected embryos were also assayed for alteration in the expression
of Xtwist, the Xenopus homolog of Drosophila twist, to determine whether neuroD1
converted non-neuronal components of neural crest cells into the neural lineage. In
wild-type embryos, Xtwist is strongly expressed in the non-neuronal population of
cephalic neural crest cells that give rise to the connective tissue and skeleton of the
head. NeuroD1-injected embryos were completely missing Xtwist expression in the
migrating cranial neural crest cells on the injected side. The failure to generate
sufficient cranial mesenchymal neural crest precursors in neuroD1-injected embryos
was also observed morphologically, since many of the injected embryos exhibited
poor branchial arch development in the head. Furthermore, the increased mass of
cells in the cephalic region stained very strongly for N-CAM, β-tubulin, and Xen-1,
indicating that these cells were neural in character.

The converse experiment in which frog embryos were injected with Xtwist
mRNA showed that ectopic expression of Xtwist significantly decreased neuroD1
expression on the injected side. Thus, two members of the bHLH family, neuroD1
and Xtwist, may compete for defining the identity of different cell types derived from
the neural crest. In the neuroD1-injected embryos, exogenous neuroD1 may induce
premigratory neural crest to differentiate into neurons in situ, and consequently they
fail to migrate to their normal positions.

The effect of introduction of exogenous neuroD1 on the fate of cells that
normally express neuroD1, such as cranial ganglia, eye, otic vesicle, olfactory organs,
and primary neurons, and on other CNS cells that normally do not express neuroD1,
was determined by staining for differentiation markers. When the cranial region of the
embryo was severely affected by ectopic neuroD1, the injected side of the embryos
displayed either small or no eyes in addition to poorly organized brains, otic vesicles,
and olfactory organs. Moreover, as the embryos grew, the spinal cord showed
retarded growth, remaining thinner and shorter on the neuroD1-injected side.

N-CAM staining in the normal embryo at early stages was not uniform
throughout the entire neural plate, but rather was more prominent in the medial region
of the neural plate. Injected embryos analyzed for N-CAM expression showed that
the neural plate on the injected side of the early stage embryos was stained more
intensely and more laterally. The increase in N-CAM staining was not associated with
any lateral expansion of the neural plate as assayed by visual inspection and staining
with the epidermal marker EpA. This was in contrast to what has been observed with
XASH-3 injection, which causes neural plate expansion. These observations suggest
that the first effects of neuroD1 are to cause neuronal precursors in the neural plate to
differentiate prematurely.

To determine whether neuroD1 caused neuronal precursors to differentiate
prematurely, injected embryos were stained using two neuronal markers that are
expressed in differentiated neurons, neural specific β-tubulin and tanabin. In situ
hybridization for β-tubulin and tanabin was carried out as described above.
Over-expression of neuroD1 dramatically increased the β-tubulin signals in the region of the
neural plate containing both motor neurons and Rohon-Beard cells at stage 14. The
earliest ectopic β-tubulin positive cells on the injected side were observed at the end
of gastrulation when the control side did not yet show any β-tubulin positive cells.
Tanabin was also expressed in more cells in the spinal cord in the neuroD1 injected
side of the embryos at stage 14. These results suggest that neuroD1 can cause
premature differentiation of the neural precursors into differentiated neurons. This is
a powerful indication that, when ectopically expressed or over-expressed, neuroD1
can differentiate mitotic cells into non-dividing mature neurons.

To determine if neuroD2 also was capable of inducing ectopic neuronal
development in the frog, mouse neuroD2 RNA was injected into one side of a
two-cell X. laevis embryo, the uninjected side serving as a control. The neuroD2 mRNA
was made from pCS2MT-mND2, an expression vector that was constructed as
follows. Expression vectors were made in pCS2+ or pCS2+MT (Turner, D.L.
and H. Weintraub, Genes & Dev. 8:1434-1447, 1994), both of which contain the
simian CMV promoter; pCS2+MT additionally contains six copies of the myc epitope
recognized by the 9E10 monoclonal antibody (ATCC:CRL1729) cloned in-frame
upstream of the insert. The 1.75 kb full length human neuroD1 cDNA (Tamimi et al.,
Genomics 34:418-421, 1996) from plasmid phcnd1-17a was cloned into the EcoRI
site to make pCS2-hND1-17s (hereafter referred to as pCS2-hND1). The 1.53 kb
genomic region containing the entire coding sequence of the human neuroD2 gene
(described in Example 11) was cloned into the StuI-XbaI site to make pCS2-hND2-14B1
(hereafter referred to as pCS2-hND2). The mouse 1.95 kb neuroD2 cDNA was cloned
into the EcoRI-XhoI sites to make pCS2-mND2-1.1.1 (hereafter referred to as
pCS2-mND2). For the myc-tagged construct, synthetic oligonucleotide-mediated
mutagenesis was used to introduce an EcoRI site adjacent to the initial ATG codon
so that the myc-tag and neuroD2 coding regions were in-frame, to make pCS2MT-mND2.

When injected into Xenopus laevis, mouse neuroD2 mRNA was able to
induce ectopic neuronal development as determined by immunohistochemistry with an
anti-NCAM antibody. An anti-myc tag antibody, 9E10, was used to confirm that
most ectodermal cells on the injected side of the frog expressed the myc-tagged
mouse neuroD2, and approximately 80-90% of injected embryos stained positively
with either the anti-myc or anti-NCAM antibodies. Injection of RNA encoding the
human neuroD2 gene resulted in an ectopic neuronal phenotype similar to that seen
with Xenopus neuroD1 and murine neuroD2. This demonstrates that both neuroD1
and neuroD2 can regulate the formation of neurons and that the human and mouse
neuroD2 proteins are capable of functioning in the developing Xenopus embryo.

Developmental expression patterns suggest two distinct sub-families of
neurogenic bHLH genes. MATH1 and neuroD3 share similarity in the bHLH region
and have similar temporal expression patterns, with RNA expression detected around
embryonic day 10, but not persisting in the mature nervous system. MATH1 RNA
was localized to the dorsal neural tube in 10.5-11.5 day embryos, but by birth was
present only in the external granule cell layer of the cerebellum, the progenitors of the
cerebellar granule cell layer (Akazawa et al., 1995). In contrast, the neuroD1,
neuroD2, and MATH2/NEX-1 genes are expressed in both differentiating and mature
neurons. Northern analysis demonstrated that neuroD2 expression begins around
embryonic day 11 and continues through day 16, the latest embryonic time point
tested. NeuroD2 was detected in the brain of neonates as well as adult mice, with
relatively equal abundance in both the cerebellum and cortex. Similar to neuroD2, the
CNS expression of neuroD1 persists postnatally, as does its expression in the
beta cells of the pancreas (Naya et al., 1995). Northern blot analysis indicated that
neuroD1 expression in the adult mouse brain is most abundant in the cerebellum with
lower levels in the cerebral cortex and brain stem. NEX-1/MATH2 gene expression
is reported to occur by embryonic day 11.5, and at embryonic day 15.5 its expression
is limited to the intermediate zone adjacent to the mitotically active ventricular zone,
suggesting that NEX-1/MATH2 is expressed primarily in the newly differentiating
neurons at this stage (Bartholoma, A. and K. A. Nave, 1994; Shimizu et al., 1995). In
mature brain, NEX-1/MATH2 is expressed in neurons comprising the hippocampus,
subsets of cortical neurons, and post migratory cerebellar granule cells, but the reports
disagree on whether this gene is expressed in the dentate gyrus of the hippocampus.
It is interesting to note that the Northern analysis of MATH2 expression reported by
Shimizu et al. (1995) shows high levels in the cerebral cortex and low levels in the
cerebellum, the opposite of the expression pattern seen for neuroD1, suggesting that
these genes may also have significant differences in relative abundance in specific
regions of the nervous system. Therefore, it appears that MATH1 and neuroD3 are
expressed early in nervous system development and may have a role in either
determining or expanding a population of neuronal precursors, whereas the persistent
expression of neuroD1, neuroD2 and NEX-1/MATH2 suggests a role in initiating and
maintaining expression of genes related to neuronal differentiation.

Kume et al. (Biochem. Biophys. Res. Comm. 219:526-530, 1996) have
reported the cloning of a helix-loop-helix gene from rat brain using a strategy
designed to identify genes that are expressed during tetanic stimulation of
hippocampal neurons in a model of long-term potentiation. The gene they describe,
KW8, is the rat homolog of the mouse and human neuroD2 gene described here.
Kume et al. also describe expression in the adult brain, including the hippocampus.
Subsequently, Yasunami et al. (Biochem. Biophys. Res. Comm. 220:754-758, 1996)
reported the mouse NDRF gene, which is nearly identical to neuroD2 and demonstrates
a similar expression pattern in adult brain by in situ hybridization.

While expression of either neuroD1 or neuroD2 in Xenopus laevis embryos
resulted in ectopic neuronal development, it is interesting to note that neither neuroD1
nor neuroD2 was capable of converting all cell types in which it was present into
neurons. As in the case of neuroD1, the ectopic neurons induced by neuroD2 were
confined to a subpopulation of ectodermal cells, as indicated by the spotty NCAM
positive staining pattern. The apparent restricted activity of the neuroD proteins to a
subset of cells derived from the ectoderm suggests that other factors may regulate
their activity, such as the notch pathway that mediates lateral inhibition during
Drosophila neurogenesis.

While the induction of ectopic neurogenesis by both neuroD1 and neuroD2 in
Xenopus embryos suggests a similar function, the developmental expression patterns
and in vitro transfection experiments indicate that the family members may serve both
overlapping and distinct functions. Previous studies have demonstrated that
neuroD/beta2 and NEX-1/MATH2 can bind the core CANNTG sequence of an
E-box as a heterodimer with an E-protein and activate transcription.
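
The core E-box consensus CANNTG referred to above, in which N is any nucleotide, can be located in a nucleotide sequence by simple pattern matching, as in the following minimal sketch. The function name and the example sequence are hypothetical and are used only for illustration; the sketch identifies candidate core motifs and says nothing about whether a given site is actually bound.

```python
import re

# CANNTG with N = A, C, G or T. The lookahead lets overlapping sites be reported.
# Because the reverse complement of CANNTG is also CANNTG, scanning one strand
# finds the core motif on both strands.
EBOX_CORE = re.compile(r"(?=(CA[ACGT]{2}TG))")


def find_ebox_cores(sequence: str) -> list[tuple[int, str]]:
    """Return (0-based position, matched hexamer) for each CANNTG core motif."""
    return [(m.start(1), m.group(1)) for m in EBOX_CORE.finditer(sequence.upper())]


# Made-up sequence for illustration only; it is not taken from any neuroD target gene.
example = "GGCAGCTGTTCATATGAA"
print(find_ebox_cores(example))  # [(2, 'CAGCTG'), (10, 'CATATG')]
```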

In the work presented here, it is shown that both neuroD1 and neuroD2 can
activate a construct containing multimerized E-boxes. They also activate a construct
driven by a genomic fragment from the neuroD2 gene that presumably contains
regulatory regions for neuroD2, and the temporal expression pattern of neuroD1 and
neuroD2 proteins in embryogenesis and P19 differentiation suggests a model in which
neuroD1 may activate neuroD2 expression during development. Most important,
however, is the demonstration that neuroD1 and neuroD2 have different capacities to
activate a construct driven by the core regulatory sequences of the GAP-43 gene,
demonstrating that the highly related neuroD1 and neuroD2 proteins are capable of
regulating specific subsets of genes. This promoter contains several E-boxes and it
remains to be determined if neuroD2 directly binds to these sites.

In the bHLH region, neuroD1 and neuroD2 differ by only 2 amino acids and it
would be anticipated that they recognize the same core binding sequences. Therefore,
the differential regulation of transcriptional activity may be determined independently
of DNA binding. The amino acid following the histidine in the junction region of the
basic region is a glycine in neuroD1, NEX-1/MATH2, and MATH1, an aspartate in
neuroD2, and an asparagine in neuroD3. This residue is positioned at the same site as
the lysine residue in the myogenic bHLH proteins that has been shown to be one of
the residues critical for myogenic activity (Davis et al., Cell 60:733-746, 1990; Davis, R. L.
and H. Weintraub, Science 256:1027-1030, 1992; Weintraub et al., Genes & Dev.
5:1377-1386, 1991). In this case, it has been postulated to be a site of potential
interaction with co-activator factors that regulate transcriptional activity. If the
neuroD proteins have a similar mechanism for exerting their regulatory activities, it is
possible that variability at this residue mediates different target
specificities. Alternatively, the more divergent amino- and carboxy-terminal regions
could confer regulation by interaction with other activators or repressors.

The different expression patterns in the mature nervous system and the subtle
differences in target genes are similar to those of the myogenic bHLH proteins. In mature muscle,
MyoD is expressed in fast muscle fibers and myogenin in slow fibers (Asakura et al.,
Develop. Biol. 171:386-398, 1995; Hughes et al., Development 118:1137-1147,
1993) and transfection studies demonstrate that sequences adjacent to the core E-box
sequence can differentially regulate the ability of MyoD and myogenin to function as
transcriptional activators (Asakura et al., Molec. & Cell. Biol. 13:7153-7162, 1993),
presumably by interaction of other regulatory factors with the non-bHLH regions of
MyoD and myogenin. For the neuroD-related genes, the partially overlapping
expression patterns and partially overlapping target genes suggest that they may act in
a combinatorial fashion to directly regulate overlapping subsets of genes and thereby
confer specific neuronal phenotypes. In this model, it is possible that a small family of
neuroD-related transcription factors acts to establish the identity of a limited number
of neuronal sub-types and that local inductive events influence the generation of a
higher complexity. Alternatively, it is possible that many additional members of this
sub-family are yet to be identified and they may act to directly determine specific
neuronal attributes.

EXAMPLE 11

Genomic clones of human neuroD1, neuroD2 and neuroD3 and mouse neuroD3.
Genomic clones encoding human neuroD1 were obtained by probing a human
fibroblast genomic library with the mouse neuroD1 cDNA. Host E. coli strain LE392
cells (New England Biolabs) were grown in LB + 10 mM MgSO4, 0.2% maltose overnight
at 37°C. The cells were harvested and resuspended in 10 mM MgSO4 to a final
OD600 of 2. The resuspended cells were used as hosts for phage infection. The
optimal volume of phage stock for use in this screening was determined by using
serial dilutions of the phage stock of a human fibroblast genomic library in lambda
FIX II (Stratagene®) to infect LE392 cells (New England Biolabs). To obtain
approximately 50,000 plaques per plate, a 2.5 µl aliquot of the phage stock was used
to infect 600 µl of the resuspended LE392 cells. The cells were incubated with the
phage for 15 minutes at 37°C, after which the cells were mixed with 6.5 ml of top
agar warmed to 50°C. The top agar was plated on solid LB, and incubated overnight
at 37°C. A total of 22 15-cm plates were prepared in this manner.
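As a rough check on the scale of this primary screen, the per-plate figures given above can be multiplied out. The following is a minimal sketch; the function and constant names are illustrative and are not part of the described method, and the plaque density is the approximate target stated in the text, not a measured count.

```python
# Illustrative bookkeeping for the primary plaque screen.
# All numbers are taken from the text above; all names are hypothetical.
PLAQUES_PER_PLATE = 50_000   # approximate target density per 15-cm plate
PLATES = 22                  # number of plates prepared
PHAGE_UL_PER_PLATE = 2.5     # phage stock used per infection (microliters)
CELLS_UL_PER_PLATE = 600     # resuspended LE392 cells per infection (microliters)


def screen_scale(plates=PLATES,
                 plaques_per_plate=PLAQUES_PER_PLATE,
                 phage_ul=PHAGE_UL_PER_PLATE,
                 cells_ul=CELLS_UL_PER_PLATE):
    """Return approximate totals for the whole screen."""
    return {
        "total_plaques": plates * plaques_per_plate,
        "phage_stock_ul": plates * phage_ul,
        "host_cells_ul": plates * cells_ul,
    }


if __name__ == "__main__":
    for name, value in screen_scale().items():
        print(f"{name}: {value}")
```

Under these figures the screen covers on the order of one million plaques in total.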

Duplicate plaque lifts were prepared. A first set of Hybond membranes
(Amersham) was placed onto the plates and allowed to sit for 2 minutes. The initial
membranes were removed and the duplicate membranes were laid on the plates for
4 minutes. The membranes were allowed to air dry; then the phage were denatured in
0.5 M NaOH, 1.5 M NaCl for 7 minutes. The membranes were neutralized with two
washes in neutralization buffer (1.5 M NaCl, 0.5 M Tris, pH 7.2). After
neutralization, the membranes were crosslinked by exposure to UV. A 1 kb
Eco RI-Hind III fragment containing murine neuroD1 coding sequences was random primed
using the Random Priming Kit (Boehringer Mannheim) according to the
manufacturer's instructions. Membranes were prepared for hybridization by placing
six membranes in 10 ml of FBI hybridization buffer [100 g polyethylene glycol 800,
350 ml 20% SDS, 75 ml 20X SSPE; add water to a final volume of one liter] and
incubating the membranes at 65°C for 10 minutes. After 10 minutes, denatured
salmon sperm DNA was added to a final concentration of 10 µg/ml and denatured
probe was added to a final concentration of 0.25-0.5 x 10 cpm/ml. The membranes
were hybridized at 65°C for a period of 8 hours to overnight. After incubation, the
excess probe was removed, and the membranes were washed first in 2 X SSC, 0.1%
SDS for 30 minutes at 50°C. The first wash was followed by a final wash in 0.1 X
SSC, 0.1% SDS for 30 minutes at 55°C (moderate stringency). Autoradiographs of
the membranes were prepared. The first screen identified 55 putative positive
plaques. Thirty-one of the plaques were subjected to a secondary screen using the
method essentially set forth above. Ten positive clones were identified and subjected
to a tertiary screen as described above. Eight positive clones were identified after the
tertiary screen. Of these eight clones, three (14B1, 9F1 and 20A1) were chosen for
further analysis. Clones 14B1 and 20A1 were deposited at the American Type
Culture Collection, 12301 Parklawn Drive, Rockville, MD 20852 USA, on
November 1, 1995, under accession numbers 69943 and 69942, respectively.
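The hybridization buffer recipe given in this example is stated as stock volumes per liter. A short dilution calculation (C1·V1 = C2·V2) shows the final concentrations it implies and how the recipe scales to the 10 ml used per six membranes. This is only an arithmetic sketch; the names are hypothetical, and the values are copied from the bracketed recipe as printed.

```python
# Arithmetic sketch for the hybridization buffer recipe stated in the text.
# Values come from the bracketed recipe; all names are hypothetical.
FINAL_VOLUME_ML = 1000.0
PEG_GRAMS = 100.0
SDS_STOCK_PERCENT, SDS_STOCK_ML = 20.0, 350.0
SSPE_STOCK_X, SSPE_STOCK_ML = 20.0, 75.0


def diluted(stock_conc, stock_ml, final_ml=FINAL_VOLUME_ML):
    """Solve C1*V1 = C2*V2 for the final concentration C2."""
    return stock_conc * stock_ml / final_ml


def final_concentrations():
    """Final concentrations implied by the one-liter recipe."""
    return {
        # percent w/v is grams per 100 ml
        "peg_percent_w_v": PEG_GRAMS * 100.0 / FINAL_VOLUME_ML,
        "sds_percent": diluted(SDS_STOCK_PERCENT, SDS_STOCK_ML),
        "sspe_x": diluted(SSPE_STOCK_X, SSPE_STOCK_ML),
    }


def scaled_recipe(batch_ml):
    """Component amounts needed for a smaller batch of the same buffer."""
    factor = batch_ml / FINAL_VOLUME_ML
    return {
        "peg_g": PEG_GRAMS * factor,
        "sds_stock_ml": SDS_STOCK_ML * factor,
        "sspe_stock_ml": SSPE_STOCK_ML * factor,
    }


if __name__ == "__main__":
    print(final_concentrations())
    print(scaled_recipe(10.0))
```

On these numbers the buffer works out to 10% (w/v) polyethylene glycol, 7% SDS and 1.5X SSPE.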

Phage DNA was prepared from clones 14B1, 9F1, and 20A1. The 14B1 and
20A1 phage DNA were digested with Pst I to isolate the 1.2 kb and 1.6 kb fragments,
respectively, that hybridized to the mouse neuroD1 probe. The 9F1 phage DNA was
digested with Eco RI and Sac I to obtain an approximately 2.2 kb fragment that
hybridized with the mouse neuroD1 probe. The fragments were each subcloned into
plasmid BLUESCRIPT SK (Stratagene) that had been linearized with the appropriate
restriction enzyme(s). The fragments were sequenced using Sequenase Version 2.0
(US Biochemical) and the following primers: the universal primer M13-21, the T7
primer, and the T3 primer.

Sequence analysis of clones 9F1 (SEQ ID NOS:8 and 9) and 14B1 (SEQ ID
NOS:10 and 11) showed a high similarity between the mouse and human coding
sequences at both the amino acid and nucleotide level. In addition, while clones 9F1
and 14B1 shared 100% identity in the HLH region at the amino acid level
(i.e., residues 117-156 in SEQ ID NO:9 and residues 137-176 in SEQ ID NO:11),
they diverged in the region amino-terminal to the bHLH. This finding strongly suggests that
14B1 is a member of the neuroD family of genes. Sequence analysis demonstrates
that clone 9F1 has a high degree of homology throughout the sequence region that
spans the translation start site to the end of the bHLH region. The 9F1 clone has
100% identity to mouse neuroD1 in the HLH region (i.e., residues 117-156 in
SEQ ID NO:9 and residues 117-156 in SEQ ID NO:2), and an overall identity of
94%. The 14B1 clone also has 100% identity to the HLH region (i.e., residues
137-176 in SEQ ID NO:11 and residues 117-156 in SEQ ID NO:2), but only 40%
identity to 9F1 and 39% identity to mouse neuroD1 in the amino-terminal region.
This demonstrates that 9F1 is the human homolog of mouse neuroD1, whereas the
strong conservation of the neuroD HLH identifies 14B1 as another member of the
neuroD HLH subfamily. Human clone 9F1 (represented by SEQ ID NOS:8 and 9) is
referred to as human neuroD1. Human clone 14B1 is referred to as neuroD2 (SEQ
ID NOS:10 and 11), and human clone 20A1 is referred to as neuroD3 (SEQ ID
NOS:12 and 13).
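Percent identity figures such as those quoted above are computed over aligned residue positions. The sketch below shows one common way to do this for two sequences that have already been aligned to equal length, skipping gap positions. It is an illustration only: the text does not state which alignment method or gap convention was used, and the toy peptides in the usage lines are invented, not taken from the SEQ ID listings.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent of aligned, non-gap positions at which two sequences match.

    Both inputs must already be aligned to the same length. Positions where
    either sequence has a gap character are ignored (one possible convention;
    the source does not specify which convention was used).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = [(x, y) for x, y in zip(aligned_a, aligned_b)
                if x != gap and y != gap]
    if not compared:
        return 0.0
    matches = sum(1 for x, y in compared if x == y)
    return 100.0 * matches / len(compared)


if __name__ == "__main__":
    # Invented toy peptides, for illustration only.
    print(percent_identity("MTKSYSES", "MTKSYSES"))
    print(percent_identity("MTKS-SES", "MTRSYSEA"))
```

Restricting the same calculation to a residue window (for example the HLH region alone) gives region-specific values of the kind reported above.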

A fragment of the human neuroD2 gene was used to screen both a mouse
genomic library and an embryonic day 16 mouse cDNA library. An 800 bp
Hind III-Eag I fragment from the neuroD2 sequences from clone 14B1 was random primed
with 32P, and used to screen a 16-day mouse embryo cDNA library essentially as
described previously. Filters were prehybridized in FBI hybridization buffer (see
above) at 50°C for 10 minutes. After prehybridization, denatured salmon sperm DNA
was added to a final concentration of 10 µg/ml; denatured probe was added to a final
concentration of one million cpm/ml. The filters were hybridized at 50°C overnight.
After incubation, excess probe was removed, and the filters were washed first in
2 X SSC, 0.1% SDS for 30 minutes at 60°C. Genomic clones were obtained and
characterized. Five independent cDNAs were mapped by restriction endonucleases
and demonstrated identical restriction sites and sequence. One clone, designated
1.1.1, contained 1.46 kb of murine neuroD2 cDNA as an Eco RI-Hind III insert. The
nucleotide sequence and deduced amino acid sequences are shown in SEQ ID
NOS:16 and 17, respectively. A comparison with the corresponding mouse genomic
sequence demonstrated that the entire coding region of neuroD2 is contained in the
second exon.

The mouse neuroD2 cDNA sequence indicated a predicted protein of 382
amino acids that differs from the major open reading frame in the human neuroD2
gene at only 9 residues, all in the amino-terminal portion of the protein. The human
neuroD2 protein was found to have 98% similarity to neuroD1 and MATH2 in the
bHLH region and 90% similarity in the 30 amino acids immediately carboxy-terminal
to the bHLH region. Similar to neuroD1 and MATH2, neuroD2 contains an
amino-terminal region rich in glutamate residues that may constitute an acidic
activation domain, and has other regions of similarity to neuroD1 throughout the
protein.

Mouse neuroD3 was obtained by screening a 129SV mouse genomic library
cloned in lambda-Dash II (Stratagene®), using a labeled Pst-Pst genomic fragment
containing the human neuroD coding sequence using conditions essentially as
described above for selecting mouse neuroD2, with the exception that the
prehybridization and hybridization were carried out at 55°C and the final wash was
carried out at 50°C.

Since all identified members of the family of genes related to neuroD1 are
known to have their entire coding sequence in a single exon, the major open reading
frames (ORFs) encoded in the genomic DNA from human and mouse neuroD3 were
determined (SEQ ID NO:12 and SEQ ID NO:21, respectively). The predicted amino
acid sequences of the mouse and human neuroD3 proteins are based on the major
ORF in the corresponding genomic DNAs, since cDNAs have not been cloned for
these genes. The genomic sequence of mouse neuroD3 contains a major ORF of 244
amino acids and the human neuroD3 gene an ORF of 237 amino acids that differs
from the predicted mouse protein at 26 positions. The entire coding region of other
neuroD family members is contained within a single exon, and therefore it is possible
that the ORF in the neuroD3 genomic DNA represents the entire coding region, a
notion supported by the conservation between mouse and human that extends to the
stop codon. The major ORF predicts a smaller protein than related neuroD family
members, and lacks the acidic-rich amino-terminal region. The bHLH region has some
elements of the loop that are similar to MATH1, but the overall level of homology in
the bHLH region is closer to the neuroD-related genes. In contrast to neuroD2, the
neuroD3 protein does not contain significant regions of homology to neuroD1 or
MATH2/NEX-1 outside of the bHLH region and does not have an amino-terminal
region rich in glutamates or acidic amino acids.

The GenBank accession numbers are: human neuroD2, U58681; mouse
neuroD2, U58471; human neuroD3, U63842; mouse neuroD3, U63841.

EXAMPLE 12
Chromosome mapping of human neuroD1 clones.
FISH karyotyping was performed on fixed metaphase spreads of the microcell
hybrids essentially as described (Trask et al., Am. J. Hum. Genet. 48:1-15, 1991; and
Brandriff et al., Genomics 10:75-82, 1991; which are incorporated by reference herein
in their entirety). NeuroD1 sequences were detected using the 9F1 or 20A1 phage
DNA as probes labeled using digoxigenin-dUTP (Boehringer Mannheim) according to
the manufacturer's instructions. Phage DNA was biotinylated by random priming
(Gibco/BRL BioNick Kit) and hybridized in situ to denatured metaphase chromosome
spreads for 24-48 hours. Probes were detected with rhodamine-conjugated antibodies
to digoxigenin, and chromosomes were counterstained with DAPI (Sigma). Signals
were viewed through a fluorescence microscope and photographs were taken with
color slide film. FISH analysis indicated clone 9F1 maps to human chromosome 2q,
and clone 20A1 maps to human chromosome 5.

Chromosome mapping was also carried out on a human/rodent somatic cell
hybrid panel (National Institute of General Medical Sciences, Camden, NJ). This
panel consists of DNA isolated from 24 human/rodent somatic cell hybrids retaining
one human chromosome. For one set of experiments, the panel of DNAs was
digested with Eco RI and electrophoresed on an agarose gel. The DNA was
transferred to Hybond-N membranes (Amersham). A random primed (Boehringer
Mannheim) 4 kb Eco RI-Sac I fragment of clone 9F1 was prepared. The filter was
prehybridized in 10 ml of FBI hybridization buffer (see above) at 65°C for 10 minutes.
After prehybridization, denatured salmon sperm DNA was added to a final
concentration of 10 µg/ml; denatured probe was added to a final concentration of one
million cpm/ml. The filter was hybridized at 65°C for a period of 8 hours to
overnight. After incubation, excess probe was removed, and the filter was washed
first in 2 X SSC, 0.1% SDS for 30 minutes at 65°C. The first wash was followed by a
final stringent wash in 0.1 X SSC, 0.1% SDS for 30 minutes at 65°C. An
autoradiograph of the filter was prepared. Autoradiographs confirmed the FISH
mapping results.

In the second experiment, the panel was digested with Pst I, electrophoresed
and transferred essentially as described above. A random-primed (Boehringer
Mannheim) 1.6 kb Pst I fragment of clone 20A1 was prepared. The membrane was
prehybridized, hybridized with the 20A1 probe and washed as described above.
Autoradiographs of the Southern filter showed that 20A1 mapped to human
chromosome 5 and confirmed the FISH mapping results. After autoradiography, the
20A1-probed membrane was stripped by a wash in 0.5 M NaOH, 1.5 M NaCl. The
membrane was neutralized in 0.5 M Tris-HCl (pH 7.4), 1.5 M NaCl. The filter was
washed in 0.1 X SSC before prehybridization. A random-primed (Boehringer
Mannheim) 1.2 kb Pst I fragment of clone 14B1 was prepared. The washed
membrane was prehybridized and hybridized with the 14B1 probe as described above.
After washing under the previously described conditions, the membrane was
autoradiographed. Autoradiographs demonstrated that clone 14B1 mapped to
chromosome 17.

EXAMPLE 13
Human neuroD1 complementary DNA.
To obtain a human neuroD1 cDNA, one million plaque forming units (pfu)
were plated onto twenty LB + 10 mM MgSO4 (150 mm) plates using the Stratagene
human cDNA library in Lambda ZAP II in the bacterial strain XL-1 Blue (Stratagene).
Plating and membrane lifts were performed using standard methods, as described in
Example 11. After UV cross-linking, the membranes were pre-hybridized in an
aqueous hybridization solution (1% bovine serum albumin, 1 mM EDTA, 0.5 M
Na2HPO4 (pH 7.4), 7% SDS) at 50°C for two hours.

The mouse neuroD1 cDNA insert was prepared by digesting the pKS+ m7a
RX plasmid with Eco RI and Xho I, and isolating the fragment containing the cDNA
by electroelution. A probe was made with the cDNA-containing fragment by random
primed synthesis with random hexanucleotides, dGTP, dATP, dTTP,
alpha-32P-labeled dCTP, and Klenow in a buffered solution (25 mM Tris (pH 6.9), 50 mM
KCl, 5 mM MgCl2, 1 mM DTT). The probe was purified from the unincorporated
nucleotides on a G-50 Sepharose® column. The purified probe was heat denatured at
90°C for 3 minutes.

After prehybridization, the denatured probe was added to the membranes in
hybridization solution. The membranes were hybridized for 24 hours at 50°C. Excess
probe was removed from the membranes, and the membranes were washed in 0.1 X
SSC, 0.1% SDS for 20 minutes at 50°C. The wash solution was changed five times.
The membranes were blotted dry and covered with plastic film before being subjected
to autoradiography. Autoradiography of the filters identified 68 positive clones. The
clones were plaque-purified and rescreened to obtain 40 pure, positive clones. The
positive clones were screened with a random-primed Pst I fragment from clone 9F1
(human neuroD1). Twelve positive clones that hybridized with the human neuroD1
genomic probe were isolated.

The plasmid vector containing the cDNA insert was excised in vivo from the
lambda phage clone according to the Stratagene methodology. Briefly, eluted phage
and XL-1 Blue cells (200 microliters of OD600 = 1) were mixed with R408 helper
phage provided by Stratagene for 15 minutes at 37°C. Five milliliters of rich bacterial
growth media (2 X YT, see Sambrook et al., ibid.) was added, and the cultures were
incubated for 3 hours at 37°C. The tubes were heated at 70°C for 20 minutes and
spun for 5 minutes at 4,000 X g. After centrifugation, 200 microliters of supernatant
were added to the same volume of XL-1 Blue cells (OD = 1), and the mixture was
incubated for 15 minutes at 37°C, after which the bacterial cells were plated onto LB
plates containing 50 mg/ml ampicillin. Each colony was picked and grown for
sequencing template preparation. The clones were sequenced and compared to the
human genomic sequence. A full length cDNA encoding human neuroD1 that was
identical to the 9F1 neuroD1 genomic sequence was obtained and designated HC2A.
The nucleotide and deduced amino acid sequences are shown in SEQ ID NOS:14
and 15, respectively. Clone HC2A was deposited at the American Type Culture
Collection, 12301 Parklawn Drive, Rockville, MD 20852 USA, on November 1,
1995, under accession number 69944.

Using a random-primed radiolabeled antisense probe to the mouse neuroD1
(Boehringer Mannheim), the expression pattern was determined using Northern
analysis. Filters containing murine RNA from the brain and spinal cords of embryonic
through adult mice were probed at high stringency and washed in 0.1 X SSC, 0.1%
SDS at 65°C. Northern analysis showed neuroD1 expression in the brain and spinal
cords of mice from embryonic day 12.5 through adult.

Experiments were also conducted to isolate a cDNA corresponding to mouse
neuroD3 mRNA. Using procedures similar to those described above, a random-primed
1.1 kb Pst I fragment from human neuroD3 clone 20A1 was prepared and
used to screen mouse embryo and newborn mouse brain libraries. For unknown
reasons, no positive clones were obtained. Likewise, attempts to clone human
neuroD3 cDNA have been unsuccessful. The difficulty in obtaining neuroD3 cDNA
may be secondary to instability of the construct in the library, since deletions in the
genomic DNA were common during amplification.

EXAMPLE 14 

Construction of knock-out mice

Knock-out mice in which the murine neuroD1 coding sequence was replaced
with the β-galactosidase gene and the neomycin resistance gene (neo) were generated
i) to assess the consequences of eliminating the murine neuroD1 protein during mouse
development and ii) to permit examination of the expression pattern of neuroD1 in
embryonic mice. Genomic neuroD1 sequences used for these knock-out mice were
obtained from 129/Sv mice so that the homologous recombination could take
place in a congenic background in 129/Sv mouse embryonic stem cells. Several
murine neuroD1 genomic clones were isolated from a genomic library prepared from
129/Sv mice (Zhuang et al., Cell 79:875-884, 1994; which is incorporated herein by
reference in its entirety) using the Bam HI-Not I neuroD1 cDNA-containing fragment
of pSK+1-83 (Example 2) as a random-primed probe essentially as described in
Example 11. Plasmid pPNT (Tybulewicz et al., Cell 65:1153-1163, 1991; which is
incorporated herein by reference in its entirety) containing the neomycin resistance
gene (neo; a positive selection marker) and the Herpes simplex virus thymidine kinase
gene (hsv-tk; a negative selection marker) under the control of the PGK promoter
provided the vector backbone for the targeting construct. A 1.4 kb 5' murine
neuroD1 genomic fragment together with the 3 kb cytoplasmic β-galactosidase gene
were inserted between the Eco RI and Xba I sites of the pPNT vector, and an 8 kb
fragment containing the genomic 3' untranslated sequence of neuroD1 was inserted
into the vector backbone between the Xho I and Not I sites.

To prepare an Eco RI-Xba I fragment containing neuroD1 promoter
sequences joined to the β-galactosidase gene, a 1.4 kb Eco RI (vector-derived)-Asp 718
fragment containing the 5' untranslated murine neuroD1 genomic
sequence was ligated to a Hind III-Xba I fragment containing the cytoplasmic
β-galactosidase gene such that the Asp 718 and Hind III sites were destroyed. The
resulting approximately 4.4 kb Eco RI-Xba I fragment, containing the 5' neuroD1
genomic sequence (including the neuroD1 promoter) and the β-galactosidase gene in
the same transcriptional orientation, was inserted into Eco RI-Xba I linearized pPNT
to yield the plasmid pPNT/5'+β-gal. A neuroD1 fragment containing 3' untranslated
DNA was obtained from a murine neuroD1 genomic clone that had been digested
with Spe I and Not I (vector-derived) to yield an 8 kb fragment. To obtain a 5' Xho I
site, the 8 kb fragment was inserted into Spe I-Not I linearized pBluescript SK+
(Stratagene), and the resulting plasmid was digested with Xho I and Not I to obtain the
8 kb neuroD1 3' genomic fragment. The Xho I-Not I fragment was inserted into
Xho I-Not I linearized pPNT/5'+β-gal to yield the neuroD1 targeting vector. The final
construct contained the 5' neuroD1 fragment, the β-galactosidase gene, and the
3' genomic neuroD1 fragment in the same orientation, and the hsv-tk and neomycin
resistance genes in the opposite orientation.

The targeting construct was transfected by electroporation into mouse
embryonic stem (ES) cells. A 129/Sv derived ES cell line, AK-7, described by Zhuang
et al. (ibid.), was used for electroporation. These ES cells were routinely cultured on
mitomycin C-treated (Sigma) SNL 76/7 cells (feeder cells) as described by McMahon
and Bradley (Cell 62:1073-1085, 1990; which is incorporated herein by reference in
its entirety) in culture medium containing high glucose DMEM supplemented with
15% fetal bovine serum (Hyclone) and 0.1 µM β-mercaptoethanol. To prepare the
targeting construct for transfection, 25 µg of the targeting construct was linearized by
digestion with Not I, phenol-chloroform extracted, and ethanol precipitated. The
linearized vector was then electroporated into 1-2 x 10 AK-7 (ES) cells. The
electroporated cells were seeded onto three 10-cm plates, with one plate receiving
50% of the electroporated cells and the remaining two plates each receiving 25% of
the electroporated cells. After 24 hours, G418 was added to each of the plates to a
final concentration of 150 µg/ml. After an additional 24 hours, gancyclovir was
added to a final concentration of 0.2 µM to the 50% plate and one of the 25% plates.
The third plate containing 25% of the electroporated cells was subjected to only G418
selection to assess the efficiency of gancyclovir selection. The culture medium for
each plate was changed every day for the first few days, and then changed as needed
after selection had occurred. After 10 days of selection, a portion of each colony was
picked microscopically with a drawn micropipette, and was directly analyzed by PCR
as described by Joyner et al. (Nature 338:153-156, 1989; which is incorporated
herein by reference in its entirety). Briefly, PCR amplification was performed as
described (Kogan et al., New England J. Med. 317:985-990, 1987; which is
incorporated herein by reference in its entirety) using 40 cycles of 93°C for
30 seconds, 57°C for 30 seconds, and 65°C for 3 minutes. To detect the wild-type
allele, primers JL34 and JL36 (SEQ ID NOS:18 and 19, respectively) were used in
the PCR reaction; to detect the mutant neuroD1 allele, primers JL34 and JL40 (SEQ
ID NOS:18 and 20, respectively) were used in the PCR reaction. Positive colonies,
identified by PCR, were subcloned into 4-well plates, expanded into 60 mm plates and
frozen into 2-3 ampules.
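The colony-screening PCR described above can be summarized as a three-step cycle repeated 40 times, with one primer pair per allele. The sketch below encodes those stated parameters and adds up the programmed hold time. It is illustrative only: the names are hypothetical, ramp times and any initial denaturation or final extension are not given in the text and are therefore not included.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    temp_c: int     # block temperature in degrees Celsius
    seconds: int    # hold time at that temperature


# Cycle parameters and primer pairs as stated in the text.
CYCLE = (Step(93, 30), Step(57, 30), Step(65, 180))
N_CYCLES = 40
PRIMER_PAIRS = {
    "wild-type": ("JL34", "JL36"),
    "mutant": ("JL34", "JL40"),
}


def total_hold_minutes(cycle=CYCLE, n_cycles=N_CYCLES) -> float:
    """Programmed hold time only; thermal ramping is not modelled."""
    return n_cycles * sum(step.seconds for step in cycle) / 60


def alleles_detected(products: dict) -> list:
    """Map observed products (primer pair -> bool) to the alleles they detect."""
    return [allele for allele, pair in PRIMER_PAIRS.items()
            if products.get(pair, False)]


if __name__ == "__main__":
    print(total_hold_minutes())
    print(alleles_detected({("JL34", "JL36"): True, ("JL34", "JL40"): True}))
```

With these parameters the programmed hold time alone is 160 minutes per run.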

Among the clones that were selected for both G418-resistance (positive
selection for neo gene expression) and gancyclovir-resistance (negative selection for
hsv-tk gene expression), 10% of the population contained correctly targeted
integration of the vector into the murine neuroD1 locus (an overall 10% targeting
frequency). The negative selection provided 4-8 fold enrichment for homologous
recombination events.

To generate chimeric mice, each positive clone was thawed and passaged once
on feeder cells. The transfected cells were trypsinized into single cells, and
blastocysts obtained from C57BL/6J mice were injected with approximately 15 cells.
The injected blastocysts were then implanted into pseudopregnant mice (C57BL/6J x
CBA). Four male chimeras arose from the injected blastocysts (AK-71, AK-72,
AK-74 and AK-75). The male chimeras AK-71 and AK-72 gave germ-line
transmission at a high rate as determined by the frequency of agouti coat color
transmission to their offspring (F1) in a cross with C57BL/6J female mice. Since 50%
of the agouti coat color offspring (F1) should represent heterozygous mutants, their
genotypes were determined by Southern blot analysis. Briefly, genomic DNA
prepared from tail biopsies was digested with Eco RI and probed with the 1.4 kb 5'
genomic sequence used to make the targeting construct. This probe detects a 4 kb
Eco RI fragment from the wild-type allele and a 6.3 kb Eco RI fragment from the
mutant allele. Therefore, a Southern analysis would show a single 4 kb band for a
wild-type mouse, 4 kb and 6.3 kb fragments for a heterozygous mouse, and a single
6.3 kb band for a homozygous mutant mouse. The resulting heterozygous (+/-) F1
offspring were mated with sibling heterozygous mice to give rise to
the homozygous (-/-) mutant mice.
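The band pattern logic described above (4 kb for the wild-type allele, 6.3 kb for the mutant allele) can be written as a small decision rule. The following is a sketch under stated assumptions: the function name and the size tolerance are hypothetical, chosen only to illustrate how estimated fragment sizes from a gel might be matched to the two expected Eco RI fragments.

```python
# Expected Eco RI fragment sizes detected by the 5' probe, from the text.
WILD_TYPE_KB = 4.0
MUTANT_KB = 6.3


def call_genotype(bands_kb, tolerance_kb=0.3):
    """Call a genotype from estimated band sizes (in kb).

    tolerance_kb is an assumed allowance for sizing error on the gel;
    it is not a value given in the source.
    """
    has_wt = any(abs(b - WILD_TYPE_KB) <= tolerance_kb for b in bands_kb)
    has_mut = any(abs(b - MUTANT_KB) <= tolerance_kb for b in bands_kb)
    if has_wt and has_mut:
        return "+/-"       # heterozygous: both fragments present
    if has_wt:
        return "+/+"       # wild-type: 4 kb fragment only
    if has_mut:
        return "-/-"       # homozygous mutant: 6.3 kb fragment only
    return "uncalled"      # neither expected fragment observed


if __name__ == "__main__":
    for lanes in ([4.0], [4.1, 6.2], [6.3], []):
        print(lanes, call_genotype(lanes))
```

A lane showing neither expected fragment is reported as uncalled, leaving failed digests or failed transfers to be repeated.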

To study neuroD1 expression patterns in embryonic mice, chimeric mice or F1
heterozygous progeny from the chimera x C57BL/6J mating were crossed with
C57BL/6J. Litters resulting from these crosses were harvested from pregnant females
and stained for β-galactosidase activity. The embryos were dissected away from all
the extra-embryonic tissue and the yolk sac was reserved for DNA analysis. The
embryos were fixed for one hour in a fixing solution (0.1 M phosphate buffer
containing 0.2% glutaraldehyde, 2% formaldehyde, 5 mM EGTA (pH 7.3), 2 mM
MgCl2). The fixing solution was removed by three thirty-minute rinses with rinse
solution (0.1 M phosphate buffer (pH 7.3) containing 2 mM MgCl2, 0.1% sodium
deoxycholate, 0.2% NP-40). The fixed embryos were stained overnight in the dark in
rinse solution containing 1 mg/ml X-gal, 5 mM sodium ferricyanide, 5 mM sodium
ferrocyanide. After staining, the embryos were rinsed with PBS and stored in the
fixing solution before preparation for examination. Examination of stained tissue
from fetal and postnatal mice heterozygous for the mutation confirmed the neuroD1
expression pattern in neuronal cells demonstrated previously by in situ hybridization
(Example 4), and also demonstrated neuroD1 expression in the pancreas and
gastrointestinal tract.

Blood glucose levels were detected using PRECISION QID blood glucose
test strips and a PRECISION QID blood glucose sensor (Medisens Inc., Waltham,
MA) according to the manufacturer's instructions. A tissue sample was taken for DNA
analysis and the pups were fixed for further histological examination. Mice
homozygous for the mutation (neuroD1) had blood glucose levels
between 2 and 3 times higher than those of wild-type mice.
Heterozygous mutants exhibited blood glucose levels similar to those of wild-type mice. Mice
that were homozygous for the mutation (lacking neuroD1) had diabetes as
demonstrated by high blood glucose levels and died by day four; some homozygous
mice died at birth.

EXAMPLE 15
NeuroD1 expression and activity in PC12 and P19
embryonic carcinoma cells
Murine PC12 pheochromocytoma cells differentiate into neurons in tissue
culture in the presence of appropriate inducers, i.e., nerve growth factor. Neither
induced nor non-induced murine PC12 cells expressed neuroD1 transcripts, nor did
control 3T3 fibroblasts produce detectable levels of neuroD1 transcription products.

P19 cells are a well characterized mouse embryonic carcinoma cell line with
the ability to differentiate into numerous cell types, including skeletal and cardiac
muscle, or neurons and glia following treatment with dimethylsulfoxide (DMSO) or
retinoic acid (RA) (Jones-Villeneuve et al., Molec. & Cell. Biol. 3:2271-2279, 1983),
respectively. To determine whether P19 cells expressed endogenous neuroD genes
during neuronal differentiation, RNA expression was analyzed for neuroD1, neuroD2,
and neuroD3 in both uninduced and induced P19 cells. To induce the formation of
neurons, P19 cells were cultured as aggregates in Petri dishes in the presence of
retinoic acid for four days. The aggregates were then plated into tissue culture dishes
in the absence of retinoic acid and neuronal differentiation occurred during a five day
period, as evidenced by the formation of neurofilament-positive, process-bearing cells.

NeuroD1 mRNA was most abundant after the cells were aggregated and
treated with RA for 4 days, and continued to be expressed at decreased levels during
the period of neuronal differentiation. NeuroD2 was not detected during the period of
RA induction, but became abundant during the period of neuronal differentiation.
Both neuroD1 and neuroD2 signals were modestly enhanced when the differentiated
P19 cultures were grown in the presence of Ara-C, which eliminates some of the
non-neuronal dividing cells, suggesting that the neuroD1 and neuroD2 genes are
preferentially expressed in the post-mitotic cell population, but further experiments will
be necessary to prove this point. NeuroD3 was first detected after two days of
induction, and was most abundant after 4 days of induction; however, unlike
neuroD1, neuroD3 mRNA was not detected at the later, more differentiated, time
points. Therefore, the temporal expression pattern of neuroD1, neuroD2, and
neuroD3 in differentiating P19 cells was similar to that seen during embryonic
development: a peak of neuroD3 expression at the time of neuronal commitment and
early neurogenesis, early and persistent expression of neuroD1, and slightly later and
persistent expression of neuroD2. Hence, P19 cells are potentially useful in screening
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assays for identifying inducers of neuroDl expression that may stimulate nerve 
regeneration and differentiation of neural tumor cells. 

NeuroD1 and neuroD2 are both expressed in neurons and both can induce
neurogenesis when expressed in frog embryos. To determine if they have the ability
to activate similar target genes, expression vectors were constructed driving the
human neuroD1 or neuroD2 coding regions from a simian cytomegalovirus promoter;
these vectors are pCS2-hND1 and pCS2-hND2, whose construction is described in
Example 10. The activity of neuroD1 and neuroD2 was assayed on reporter
constructs co-transfected into P19 cells. Other members of the neuroD family have
been shown to bind consensus E-box sequences in vitro. Gel shift assays have
demonstrated that MATH-1 and NEX-1/MATH-2 bind the consensus E-box site
CAGGTG as a heterodimer with the E47 protein, and activate the transcription of
reporter constructs (Akazawa et al., 1995; Bartholoma, A. and K. A. Nave., 1994;
Shimizu et al., 1995). In vitro gel shift assays demonstrated that neuroD1 and
neuroD2 proteins can bind to an oligo containing the core E-box CACCTG as a
heterodimer with an E-protein. Therefore, we tested the ability of neuroD1 and
neuroD2 proteins to activate transcription of a simple reporter construct composed of
a multimerized E-box with the same core sequence and the minimal promoter from
the thymidine kinase gene driving luciferase, p4RTK-luc.
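
The relationship between the two E-box cores cited above can be illustrated computationally. The following Python sketch is not part of the described experiments. It shows that CAGGTG and CACCTG are reverse complements of one another, so either spelling describes the same double-stranded site, and it scans a sequence for the core on both strands. The function names and the toy reporter sequence are hypothetical and chosen for illustration only; the actual sequence of the multimerized E-box in p4RTK-luc is not given here.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string (A, C, G, T only)."""
    complement = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(complement[base] for base in reversed(seq.upper()))


def find_sites(sequence: str, core: str) -> list[tuple[int, str]]:
    """Return (1-based position, strand) for every occurrence of `core`.

    The top strand is scanned for `core` itself ("+") and for its reverse
    complement ("-"), because a site on the bottom strand reads as the
    reverse complement on the top strand.
    """
    sequence = sequence.upper()
    core = core.upper()
    rc = reverse_complement(core)
    hits = []
    for i in range(len(sequence) - len(core) + 1):
        window = sequence[i:i + len(core)]
        if window == core:
            hits.append((i + 1, "+"))
        elif window == rc:
            hits.append((i + 1, "-"))
    return hits


# The two cores named in the text describe the same duplex site.
print(reverse_complement("CACCTG"))

# Hypothetical fragment carrying four spaced copies of the core (assumption).
toy_reporter = "GGATCC" + "AACACCTGTT" * 4 + "AAGCTT"
print(find_sites(toy_reporter, "CACCTG"))
```

A scan of this kind only locates candidate sites; whether a given E-box is bound or functional is an experimental question of the sort addressed by the gel shift and reporter assays described here.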

P19 cells to be transfected were cultured in minimal essential medium alpha
supplemented with 10% fetal bovine serum. Transfections were performed as
previously described (Tamura, M. and M. Noda., 1994), using a BBS calcium
chloride precipitation. Forty-eight hours after transfection, the cells were harvested
and assayed for luciferase and lacZ. Construction of the expression vectors
pCS2-hND1 and pCS2-hND2 was as described in Example 10. The pGAP43-luc
construct, a neuronal specific promoter construct that is upregulated in vivo in
post-mitotic, terminally differentiating neurons (Nedivi et al., J. Neurosci. 12:691-704,
1992), contained the GAP43 760 base pair promoter region driving luciferase in a
pGL2 vector modified to contain a poly-A site upstream of the multiple cloning site,
and was the generous gift of Pate Skene and Joseph Weber. The pND2-luciferase
construct was made by cloning a 1 kb fragment of mouse neuroD2 sequence,
terminating in the first exon, into the pGL3 luciferase vector.
The p4RTK-luciferase construct was made by placing the 4RTK region from HindIII
to XhoI of the p4RTK-CAT vector (Weintraub et al., Proc. Natl. Acad. Sci.
87:5623-5627, 1990) into the promoterless luciferase vector. Luciferase assays were
performed according to Current Protocols in Molecular Biology (Brasier, A. R., John
Wiley & Sons, New York, 1989).

When P19 cells were transfected as described above, cotransfection with
either pCS2-ND1 or pCS2-ND2 modestly increased the level of activity from
p4RTK-luc, by between two- and four-fold.

Additional reporter constructs were tested in P19 cells to determine whether
the neuroD1 and neuroD2 proteins had different transcriptional activation potentials.
Tests were conducted to determine the ability of pCS2-ND1 and pCS2-ND2 to
transactivate the luciferase reporter construct, pGAP43-luciferase. In contrast to the
simple E-box driven reporter, pCS2-ND1 did not show significant transactivation of
pGAP43-luciferase, while pCS2-ND2 induced expression from this construct by
approximately 4-fold over the basal activity.
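
Fold activation values such as those reported above are ratios of reporter activity with and without the expression vector. The Python sketch below shows one common way such a ratio could be computed. It assumes, as a labeled assumption not stated in this example, that each luciferase reading is divided by the lacZ reading from the same dish to correct for transfection efficiency. The function names and all numerical readings are hypothetical and are not data from these experiments.

```python
def normalized_activity(luciferase: float, lacz: float) -> float:
    """Luciferase signal corrected by the co-transfected lacZ signal.

    Assumption: lacZ serves as an internal control for transfection
    efficiency, so the reporter signal is expressed per unit of lacZ.
    """
    if lacz <= 0:
        raise ValueError("lacZ reading must be positive")
    return luciferase / lacz


def fold_activation(with_vector: tuple[float, float],
                    reporter_alone: tuple[float, float]) -> float:
    """Ratio of normalized activity with the expression vector to basal.

    Each argument is a (luciferase, lacZ) pair from one transfection.
    """
    basal = normalized_activity(*reporter_alone)
    if basal == 0:
        raise ValueError("basal activity is zero; fold change is undefined")
    return normalized_activity(*with_vector) / basal


# Hypothetical readings (arbitrary units), for illustration only.
reporter_only = (1000.0, 0.5)   # reporter construct without expression vector
plus_vector = (4000.0, 0.5)     # reporter plus an expression vector

print(fold_activation(plus_vector, reporter_only))  # 4.0
```

Because the result is a ratio of ratios, it is insensitive to the absolute units of either assay, but it remains sensitive to how much protein each expression vector actually produces, which is the caveat noted for these experiments.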

The myogenic bHLH proteins show auto- and cross-regulation, and
expression of NEX-1/MATH-2 has been shown to activate a reporter driven by the
NEX-1/MATH-2 promoter (Bartholoma and Nave, 1994). To determine if neuroD1
or neuroD2 could activate a construct containing the neuroD2 promoter, we made a
construct that contained a one kilobase fragment upstream of the mouse neuroD2
gene, terminating in the first exon, driving the luciferase reporter gene. P19 cells
were co-transfected with this pND2-luc reporter construct and the neuroD expression
vectors. Both pCS2-ND1 and pCS2-ND2 transactivated this reporter construct,
suggesting that neuroD2 may be auto-regulated and cross-regulated by other
members of the neuroD family, in a manner analogous to the regulation of the
myogenic bHLH genes.

Together these transfection experiments demonstrate that neuroD1 and
neuroD2 proteins can both activate some target genes, such as a multimerized E-box
reporter and the neuroD2 promoter, whereas the reporter construct driven by the
GAP43 promoter seems to be preferentially activated by neuroD2. At this time the
amount of protein made from each vector following transfection cannot be
quantitated, and interpretations rely on the relative activity of the reporter constructs.
Further analysis of the specificity of neuroD1 and neuroD2 will require identifying
specific cis-acting sequences in these reporters that mediate activity.

EXAMPLE 16

In situ localization of neuroD1 and neuroD2 RNA in adult mouse brain
To address the question of whether neuroD1 and neuroD2 were expressed in
neurons in the adult mouse brain and whether they were expressed in the same cells,
in situ hybridizations were performed using ³⁵S-UTP labeled RNA probes. Sections
of adult mouse brain were hybridized to anti-sense probes derived from the mouse
neuroD1 and neuroD2 cDNA fragments, using T3 and T7 generated transcripts for
sense and anti-sense probes and incorporating ³⁵S-UTP label. Frozen 4-5 micron
sagittal sections of adult mouse brain were cut, placed on Fisher Superfrost slides, and
frozen at -80°C. Hybridization to ³⁵S-UTP labeled probes and autoradiography were
performed according to Masters et al. (J. Neurosci. 14:5844-5857, 1994), which is
hereby incorporated by reference. After washing to remove unhybridized probe,
sections were coated with liquid photographic emulsion. After development of the
emulsion, dark field optics illuminated the silver grains as white spots at magnification
X160.

In the cerebellum, neuroD1 was easily detected in the granule layer, whereas
the neuroD2 signal was less intense in this region and was largely restricted to the
region of the Purkinje cells. In contrast, the neuroD1 and neuroD2 signals in the
pyramidal cells and dentate gyrus of the hippocampus were easily detected. The
neuroD2 probe hybridized preferentially to the region of the Purkinje cell layer.
These results demonstrate that neuroD1 and neuroD2 are expressed in neuronal
populations in the mature nervous system, and that their relative level of expression
varies among neuronal populations.

From the foregoing it will be appreciated that, although specific embodiments
of the invention have been described herein for purposes of illustration, various
modifications may be made without deviating from the spirit and scope of the
invention.

(i) APPLICANTS: Weintraub, Harold M.

Lee, Jacqueline E. 
Tapscott, Stephen J. 
Hollenberg, Stanley M.

(2) INFORMATION FOR SEQ ID NO: 1:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2089 base pairs

(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA
(vi) ORIGINAL SOURCE:

(A) ORGANISM: Mus musculus
(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 229..1302

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1:

ACTACGCAGC ACCGAGGTAC AGACACGCCA GCATGAAGCA CTGCGTTTAA CTTTTCCTGG 60

AGGCATCCAT TTTGCAGTGG ACTCCTGTGT ATTTCTATTT GTGTGCATTT CTGTAGGATT 120 

AGGGAGAGGG AGCTGAAGGC TTATCCAGCT TTTAAATATA GCGGGTGGAT TTCCCCCCCT 180 

TTCTTCTTCT GCTTGCCTCT CTCCCTGTTC AATACAGGAA GTGGAAAC ATG ACC AAA 237 

Met Thr Lys 
1 

TCA TAC AGC GAG AGC GGG CTG ATG GGC GAG CCT CAG CCC CAA GGT CCC 285
Ser Tyr Ser Glu Ser Gly Leu Met Gly Glu Pro Gln Pro Gln Gly Pro
5 10 15 

CCA AGC TGG ACA GAT GAG TGT CTC AGT TCT CAG GAC GAG GAA CAC GAG 333 
Pro Ser Trp Thr Asp Glu Cys Leu Ser Ser Gln Asp Glu Glu His Glu
20 25 30 35 

GCA GAC AAG AAA GAG GAC GAG CTT GAA GCC ATG AAT GCA GAG GAG GAC 381 
Ala Asp Lys Lys Glu Asp Glu Leu Glu Ala Met Asn Ala Glu Glu Asp 
40 45 50 

TCT CTG AGA AAC GGG GGA GAG GAG GAG GAG GAA GAT GAG GAT CTA GAG 429 
Ser Leu Arg Asn Gly Gly Glu Glu Glu Glu Glu Asp Glu Asp Leu Glu 
55 60 65 

GAA GAG GAG GAA GAA GAA GAG GAG GAG GAG GAT CAA AAG CCC AAG AGA 477 
Glu Glu Glu Glu Glu Glu Glu Glu Glu Glu Asp Gln Lys Pro Lys Arg
70 75 80 

CGG GGT CCC AAA AAG AAA AAG ATG ACC AAG GCG CGC CTA GAA CGT TTT 525 
Arg Gly Pro Lys Lys Lys Lys Met Thr Lys Ala Arg Leu Glu Arg Phe 
85 90 95 

AAA TTA AGG CGC ATG AAG GCC AAC GCC CGC GAG CGG AAC CGC ATG CAC 573
Lys Leu Arg Arg Met Lys Ala Asn Ala Arg Glu Arg Asn Arg Met His
100 105 110 115

GGG CTG AAC GCG GCG CTG GAC AAC CTG CGC AAG GTG GTA CCT TGC TAC 621 
Gly Leu Asn Ala Ala Leu Asp Asn Leu Arg Lys Val Val Pro Cys Tyr 
120 125 130 

TCC AAG ACC CAG AAA CTG TCT AAA ATA GAG ACA CTG CGC TTG GCC AAG 669 
Ser Lys Thr Gln Lys Leu Ser Lys Ile Glu Thr Leu Arg Leu Ala Lys
135 140 145 

AAC TAC ATC TGG GCT CTG TCA GAG ATC CTG CGC TCA GGC AAA AGC CCT 717 
Asn Tyr Ile Trp Ala Leu Ser Glu Ile Leu Arg Ser Gly Lys Ser Pro
150 155 160 

GAT CTG GTC TCC TTC GTA CAG ACG CTC TGC AAA GGT TTG TCC CAG CCC 765 
Asp Leu Val Ser Phe Val Gln Thr Leu Cys Lys Gly Leu Ser Gln Pro
165 170 175 

ACT ACC AAT TTG GTC GCC GGC TGC CTG CAG CTC AAC CCT CGG ACT TTC 813 
Thr Thr Asn Leu Val Ala Gly Cys Leu Gln Leu Asn Pro Arg Thr Phe
180 185 190 195

TTG CCT GAG CAG AAC CCG GAC ATG CCC CCG CAT CTG CCA ACC GCC AGC 861
Leu Pro Glu Gln Asn Pro Asp Met Pro Pro His Leu Pro Thr Ala Ser
200 205 210 

GCT TCC TTC CCG GTG CAT CCC TAC TCC TAC CAG TCC CCT GGA CTG CCC 909 
Ala Ser Phe Pro Val His Pro Tyr Ser Tyr Gln Ser Pro Gly Leu Pro
215 220 225 

AGC CCG CCC TAC GGC ACC ATG GAC AGC TCC CAC GTC TTC CAC GTC AAG 957 
Ser Pro Pro Tyr Gly Thr Met Asp Ser Ser His Val Phe His Val Lys 
230 235 240 

CCG CCG CCA CAC GCC TAC AGC GCA GCT CTG GAG CCC TTC TTT GAA AGC 1005 
Pro Pro Pro His Ala Tyr Ser Ala Ala Leu Glu Pro Phe Phe Glu Ser 
245 250 255 

CCC CTA ACT GAC TGC ACC AGC CCT TCC TTT GAC GGA CCC CTC AGC CCG 1053 
Pro Leu Thr Asp Cys Thr Ser Pro Ser Phe Asp Gly Pro Leu Ser Pro 
260 265 270 275 

CCG CTC AGC ATC AAT GGC AAC TTC TCT TTC AAA CAC GAA CCA TCC GCC 1101 
Pro Leu Ser Ile Asn Gly Asn Phe Ser Phe Lys His Glu Pro Ser Ala
280 285 290 

GAG TTT GAA AAA AAT TAT GCC TTT ACC ATG CAC TAC CCT GCA GCG ACG 1149 
Glu Phe Glu Lys Asn Tyr Ala Phe Thr Met His Tyr Pro Ala Ala Thr 
295 300 305 

CTG GCA GGG CCC CAA AGC CAC GGA TCA ATC TTC TCT TCC GGT GCC GCT 1197 
Leu Ala Gly Pro Gln Ser His Gly Ser Ile Phe Ser Ser Gly Ala Ala
310 315 320 

GCC CCT CGC TGC GAG ATC CCC ATA GAC AAC ATT ATG TCT TTC GAT AGC 1245 
Ala Pro Arg Cys Glu Ile Pro Ile Asp Asn Ile Met Ser Phe Asp Ser
325 330 335 

CAT TCG CAT CAT GAG CGA GTC ATG AGT GCC CAG CTT AAT GCC ATC TTT 1293
His Ser His His Glu Arg Val Met Ser Ala Gln Leu Asn Ala Ile Phe
340 345 350 355 

CAC GAT TAGAGGGCAC GTCAGTTTCA CTATTCCCGG GAAACGAATC CACTGTGCGT 1349 
His Asp 



ACAGTGACTG TCCTGTTTAC AGAAGGCAGC CCTTTTGCTA AGATTGCTGC AAAGTGCAAA 1409 

TACTCAAAGC TTCAAGTGAT ATATGTATTT ATTGTCGTTA CTGCCTTTGG AAGAAACAGG 1469 

GGATCAAAGT TCCTGTTCAC CTTATGTATT GTTTTCTATA GCTCTTCTAT TTTAAAAATA 1529

ATAATACAGT AAAGTAAAAA AGAAAATGTG TACCACGAAT TTCGTGTAGC TGTATTCAGA 1589

TCGTATTAAT TATCTGATCG GGATAAAAAA AATCACAAGC AATAATTAGG ATCTATGCAA 1649

TTTTTAAACT AGTAATGGGC CAATTAAAAT ATATATAAAT ATATATTTTT CAACCAGCAT 1709

TTTACTACCT GTGACCTTTC CCATGCTGAA TTATTTTGTT GTGATTTTGT ACAGAATTTT 1769

TAATGACTTT TTATAACGTG GATTTCCTAT TTTAAAACCA TGCAGCTTCA TCAATTTTTA 1829

TACATATCAG AAAAGTAGAA TTATATCTAA TTTATACAAA ATAATTTAAC TAATTTAAAC 1889

CAGCAGAAAA GTGCTTAGAA AGTTATTGCG TTGCCTTAGC ACTTCTTTCT TCTCTAATTG 1949

TAAAAAAGAA AAAAAAAAAA AAAAAACTCG AGGGGGGGCC CGGTACCCAG CTTTTGTTCC 2009 

CTTTAGTGAG GGTTAATTGC GCGCTTGGCG TAATCATGGT CATAGCTGTT TCCTGTGTGA 2069 

ATTGTTATCC GCTCACAATT 2089 

(2) INFORMATION FOR SEQ ID NO: 2: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 357 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2: 

Met Thr Lys Ser Tyr Ser Glu Ser Gly Leu Met Gly Glu Pro Gln Pro
1 5 10 15

Gln Gly Pro Pro Ser Trp Thr Asp Glu Cys Leu Ser Ser Gln Asp Glu
20 25 30

Glu His Glu Ala Asp Lys Lys Glu Asp Glu Leu Glu Ala Met Asn Ala
35 40 45

Glu Glu Asp Ser Leu Arg Asn Gly Gly Glu Glu Glu Glu Glu Asp Glu
50 55 60

Asp Leu Glu Glu Glu Glu Glu Glu Glu Glu Glu Glu Glu Asp Gln Lys
65 70 75 80

Pro Lys Arg Arg Gly Pro Lys Lys Lys Lys Met Thr Lys Ala Arg Leu
85 90 95

Glu Arg Phe Lys Leu Arg Arg Met Lys Ala Asn Ala Arg Glu Arg Asn
100 105 110

Arg Met His Gly Leu Asn Ala Ala Leu Asp Asn Leu Arg Lys Val Val
115 120 125

Pro Cys Tyr Ser Lys Thr Gln Lys Leu Ser Lys Ile Glu Thr Leu Arg
130 135 140

Leu Ala Lys Asn Tyr Ile Trp Ala Leu Ser Glu Ile Leu Arg Ser Gly
145 150 155 160

Lys Ser Pro Asp Leu Val Ser Phe Val Gln Thr Leu Cys Lys Gly Leu
165 170 175

Ser Gln Pro Thr Thr Asn Leu Val Ala Gly Cys Leu Gln Leu Asn Pro
180 185 190

Arg Thr Phe Leu Pro Glu Gln Asn Pro Asp Met Pro Pro His Leu Pro
195 200 205

Thr Ala Ser Ala Ser Phe Pro Val His Pro Tyr Ser Tyr Gln Ser Pro
210 215 220

Gly Leu Pro Ser Pro Pro Tyr Gly Thr Met Asp Ser Ser His Val Phe
225 230 235 240

His Val Lys Pro Pro Pro His Ala Tyr Ser Ala Ala Leu Glu Pro Phe
245 250 255

Phe Glu Ser Pro Leu Thr Asp Cys Thr Ser Pro Ser Phe Asp Gly Pro
260 265 270

Leu Ser Pro Pro Leu Ser Ile Asn Gly Asn Phe Ser Phe Lys His Glu
275 280 285

Pro Ser Ala Glu Phe Glu Lys Asn Tyr Ala Phe Thr Met His Tyr Pro
290 295 300

Ala Ala Thr Leu Ala Gly Pro Gln Ser His Gly Ser Ile Phe Ser Ser
305 310 315 320

Gly Ala Ala Ala Pro Arg Cys Glu Ile Pro Ile Asp Asn Ile Met Ser
325 330 335

Phe Asp Ser His Ser His His Glu Arg Val Met Ser Ala Gln Leu Asn
340 345 350

Ala Ile Phe His Asp
355
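
The CDS coordinates given for SEQ ID NO: 1 (229..1302) can be checked against the 357-residue length stated for SEQ ID NO: 2. The Python sketch below is offered only as an illustrative consistency check and is not part of the sequence listing. It assumes that the CDS range includes the stop codon and uses the standard genetic code; the function names are hypothetical, and the translated codons are taken from the row of SEQ ID NO: 1 that begins at residue 20. The translation output uses one-letter amino acid codes.

```python
# Standard genetic code, built in the conventional T, C, A, G codon order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def translate(cds: str) -> str:
    """Translate a DNA coding sequence, stopping at the first stop codon.

    Spaces are ignored, so codons may be pasted as printed in the listing.
    Ambiguous bases such as N are not handled and raise KeyError.
    """
    cds = cds.upper().replace(" ", "")
    protein = []
    for i in range(0, len(cds) - len(cds) % 3, 3):
        residue = CODON_TABLE[cds[i:i + 3]]
        if residue == "*":
            break
        protein.append(residue)
    return "".join(protein)


def residues_encoded(start: int, end: int, ends_with_stop: bool = True) -> int:
    """Number of residues encoded by a 1-based, inclusive CDS range.

    Assumption: when `ends_with_stop` is True, the last codon in the range
    is the stop codon and is not counted as a residue.
    """
    length = end - start + 1
    if length % 3 != 0:
        raise ValueError("CDS length is not a multiple of three")
    return length // 3 - (1 if ends_with_stop else 0)


print(residues_encoded(229, 1302))                    # 357
print(translate("CCA AGC TGG ACA GAT GAG TGT CTC"))   # PSWTDECL
```

The same check can be applied to a partial coding region that lacks a stop codon by passing `ends_with_stop=False`.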



(2) INFORMATION FOR SEQ ID NO: 3:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1275 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 
(ii) MOLECULE TYPE: cDNA 
(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Xenopus laevis 
(ix) FEATURE: 

(A) NAME/KEY: CDS 

(B) LOCATION: 25..1083

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3: 

ATTTCCTTTC TCCAGATCTA AAAA ATG ACC AAA TCG TAT GGA GAG AAT GGG 51 

Met Thr Lys Ser Tyr Gly Glu Asn Gly 
1 5 

CTG ATC CTG GCC GAG ACT CCG GGC TGC AGA GGA TGG GTG GAC GAA TGC 99 
Leu Ile Leu Ala Glu Thr Pro Gly Cys Arg Gly Trp Val Asp Glu Cys
10 15 20 25 



CTG AGT TCT CAG GAT GAA AAC GAT CTG GAG AAA AAG GAG GGA GAG TTG 147
Leu Ser Ser Gln Asp Glu Asn Asp Leu Glu Lys Lys Glu Gly Glu Leu
30 35 40

ATG AAA GAA GAC GAT GAA GAC TCA CTG AAT CAT CAC AAT GGA GAG GAG 195
Met Lys Glu Asp Asp Glu Asp Ser Leu Asn His His Asn Gly Glu Glu
45 50 55 

AAC GAG GAA GAG GAT GAA GGG GAT GAG GAG GAG GAG GAC GAT GAA GAT 243 
Asn Glu Glu Glu Asp Glu Gly Asp Glu Glu Glu Glu Asp Asp Glu Asp 
60 65 70 

GAT GAT GAG GAT GAC GAC CAG AAA CCC AAA AGG CGA GGA CCG AAA AAG 291
Asp Asp Glu Asp Asp Asp Gln Lys Pro Lys Arg Arg Gly Pro Lys Lys
75 80 85 

AAA AAA ATG ACG AAA GCC CGG GTG GAG CGA TTT AAA GTG AGA CGC ATG 339 
Lys Lys Met Thr Lys Ala Arg Val Glu Arg Phe Lys Val Arg Arg Met 
90 95 100 105

AAG GCA AAC GCC AGG GAG AGG AAT CGC ATG CAC GGA CTC AAC GAT GCC 387 
Lys Ala Asn Ala Arg Glu Arg Asn Arg Met His Gly Leu Asn Asp Ala 
110 115 120 

CTG GAC AGT CTG CGC AAA GTT GTG CCC TGC TAC TCC AAA ACA CAA AAG 435
Leu Asp Ser Leu Arg Lys Val Val Pro Cys Tyr Ser Lys Thr Gln Lys
125 130 135 

TTG TCT AAG ATT GAA ACT CTG CGC CTG GCT AAG AAC TAC ATC TGG GCT 483
Leu Ser Lys Ile Glu Thr Leu Arg Leu Ala Lys Asn Tyr Ile Trp Ala
140 145 150 

CTT TCT GAG ATT TTA AGG TCC GGC AAA AGC CCA GAC CTG GTG TCC TTT 531 
Leu Ser Glu Ile Leu Arg Ser Gly Lys Ser Pro Asp Leu Val Ser Phe
155 160 165 

GTA CAA ACT CTC TGC AAA GGT TTG TCG CAG CCC ACC ACC AAT CTA GTA 579 
Val Gln Thr Leu Cys Lys Gly Leu Ser Gln Pro Thr Thr Asn Leu Val
170 175 180 185 

GCG GGG TGT CTG CAG CTG AAC CCC AGA ACT TTC CTT CCT GAG CAG AGT 627 
Ala Gly Cys Leu Gln Leu Asn Pro Arg Thr Phe Leu Pro Glu Gln Ser
190 195 200 

CAG GAC ATC CAG TCG CAC ATG CAA ACA GCG AGC TCT TCC TTC CCT CTG 675 
Gln Asp Ile Gln Ser His Met Gln Thr Ala Ser Ser Ser Phe Pro Leu
205 210 215 

CAG GGC TAT CCC TAT CAG TCC CCT GGT CTT CCC AGT CCC CCC TAT GGT 723 
Gln Gly Tyr Pro Tyr Gln Ser Pro Gly Leu Pro Ser Pro Pro Tyr Gly
220 225 230 

ACC ATG GAC AGC TCC CAT GTA TTC CAC GTC AAG CCT CAC TCC TAT GGG 771 
Thr Met Asp Ser Ser His Val Phe His Val Lys Pro His Ser Tyr Gly 
235 240 245 

GCG GCC CTG GAG CCT TTC TTT GAC AGC AGC ACC GTC ACT GAG TGT ACC 819 
Ala Ala Leu Glu Pro Phe Phe Asp Ser Ser Thr Val Thr Glu Cys Thr 
250 255 260 265

AGC CCG TCA TTC GAT GGT CCC CTG AGC CCA CCC CTT AGT GTT AAT GGG 867
Ser Pro Ser Phe Asp Gly Pro Leu Ser Pro Pro Leu Ser Val Asn Gly
270 275 280 

AAC TTT ACT TTT AAA CAC GAG CAT TCG GAG TAT GAT AAA AAT TAC ACG 915 
Asn Phe Thr Phe Lys His Glu His Ser Glu Tyr Asp Lys Asn Tyr Thr 
285 290 295 

TTC ACT ATG CAC TAT CCT GCA GCC ACT ATA TCC CAG GGC CAC GGA CCA 963 
Phe Thr Met His Tyr Pro Ala Ala Thr Ile Ser Gln Gly His Gly Pro
300 305 310 

TTG TTC TCC ACG GGG GGA CCA CGC TGT GAA ATC CCA ATA GAC ACC ATC 1011 
Leu Phe Ser Thr Gly Gly Pro Arg Cys Glu Ile Pro Ile Asp Thr Ile
315 320 325 

ATG TCC TAT GAC GGT CAC TCC CAC CAT GAA AGA GTC ATG AGT GCC CAG 1059 
Met Ser Tyr Asp Gly His Ser His His Glu Arg Val Met Ser Ala Gln
330 335 340 345 

CTA AAT GCC ATC TTT CAT GAT TAACCCTTGG AAGATCAAAA CAACTGACTG 1110 
Leu Asn Ala Ile Phe His Asp
350 

TGCATTGCCA GGACTGTCTT GTTTACCAAG GGCAGACACG TGGGTAGTAA AAGTGCAAAT 1170 

GCCCCACTCT GGGGCTGTAA CAAACTTGAT CTTGTCCTGC CTTTAGATAT GGGGAAACCT 1230

AATGTATTAA TTCCCACCTC CTTCCAATCG ACACTCCTTT AAATT 1275 

(2) INFORMATION FOR SEQ ID NO: 4: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 352 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4: 

Met Thr Lys Ser Tyr Gly Glu Asn Gly Leu Ile Leu Ala Glu Thr Pro
1 5 10 15

Gly Cys Arg Gly Trp Val Asp Glu Cys Leu Ser Ser Gln Asp Glu Asn
20 25 30

Asp Leu Glu Lys Lys Glu Gly Glu Leu Met Lys Glu Asp Asp Glu Asp
35 40 45

Ser Leu Asn His His Asn Gly Glu Glu Asn Glu Glu Glu Asp Glu Gly
50 55 60

Asp Glu Glu Glu Glu Asp Asp Glu Asp Asp Asp Glu Asp Asp Asp Gln
65 70 75 80

Lys Pro Lys Arg Arg Gly Pro Lys Lys Lys Lys Met Thr Lys Ala Arg
85 90 95

Val Glu Arg Phe Lys Val Arg Arg Met Lys Ala Asn Ala Arg Glu Arg
100 105 110

Asn Arg Met His Gly Leu Asn Asp Ala Leu Asp Ser Leu Arg Lys Val
115 120 125

Val Pro Cys Tyr Ser Lys Thr Gln Lys Leu Ser Lys Ile Glu Thr Leu
130 135 140

Arg Leu Ala Lys Asn Tyr Ile Trp Ala Leu Ser Glu Ile Leu Arg Ser
145 150 155 160

Gly Lys Ser Pro Asp Leu Val Ser Phe Val Gln Thr Leu Cys Lys Gly
165 170 175

Leu Ser Gln Pro Thr Thr Asn Leu Val Ala Gly Cys Leu Gln Leu Asn
180 185 190

Pro Arg Thr Phe Leu Pro Glu Gln Ser Gln Asp Ile Gln Ser His Met
195 200 205

Gln Thr Ala Ser Ser Ser Phe Pro Leu Gln Gly Tyr Pro Tyr Gln Ser
210 215 220

Pro Gly Leu Pro Ser Pro Pro Tyr Gly Thr Met Asp Ser Ser His Val
225 230 235 240

Phe His Val Lys Pro His Ser Tyr Gly Ala Ala Leu Glu Pro Phe Phe
245 250 255

Asp Ser Ser Thr Val Thr Glu Cys Thr Ser Pro Ser Phe Asp Gly Pro
260 265 270

Leu Ser Pro Pro Leu Ser Val Asn Gly Asn Phe Thr Phe Lys His Glu
275 280 285

His Ser Glu Tyr Asp Lys Asn Tyr Thr Phe Thr Met His Tyr Pro Ala
290 295 300

Ala Thr Ile Ser Gln Gly His Gly Pro Leu Phe Ser Thr Gly Gly Pro
305 310 315 320

Arg Cys Glu Ile Pro Ile Asp Thr Ile Met Ser Tyr Asp Gly His Ser
325 330 335

His His Glu Arg Val Met Ser Ala Gln Leu Asn Ala Ile Phe His Asp
340 345 350

(2) INFORMATION FOR SEQ ID NO: 5: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 7 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(v) FRAGMENT TYPE: internal
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5: 

Asn Ala Arg Glu Arg Arg Arg 
1 5

(2) INFORMATION FOR SEQ ID NO: 6:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 7 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(v) FRAGMENT TYPE: internal 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6: 

Asn Glu Arg Glu Arg Asn Arg 
1 5 

(2) INFORMATION FOR SEQ ID NO: 7: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 5 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(v) FRAGMENT TYPE: internal 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7:

Asn Ala Arg Glu Arg 
1 5 

(2) INFORMATION FOR SEQ ID NO: 8: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 524 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 
(vii) IMMEDIATE SOURCE: 

(B) CLONE: 9F1 
(ix) FEATURE: 

(A) NAME/KEY: CDS 

(B) LOCATION: 57..524

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8: 

TTTTTCTGCT TTTCTTTCTG TTTGCCTCTC CCTTGTTGAA TGTAGGAAAT CGAAAC 56 

ATG ACC AAA TCG TAC AGC GAG AGT GGG CTG ATG GGC GAG CCT CAG CCC 104 
Met Thr Lys Ser Tyr Ser Glu Ser Gly Leu Met Gly Glu Pro Gln Pro
1 5 10 15

CAA GGT CCT CCA AGC TGG ACA GAC GAG TGT CTC AGT TCT CAG GAC GAG 152 
Gln Gly Pro Pro Ser Trp Thr Asp Glu Cys Leu Ser Ser Gln Asp Glu
20 25 30 

GAG CAC GAG GCA GAC AAG AAG GAG GAC GAC CTC GAA GCC ATG AAC GCA 200 
Glu His Glu Ala Asp Lys Lys Glu Asp Asp Leu Glu Ala Met Asn Ala 
35 40 45 

GAG GAG GAC TCA CTG AGG AAC GGG GGA GAG GAG GAG GAC GAA GAT GAG 248
Glu Glu Asp Ser Leu Arg Asn Gly Gly Glu Glu Glu Asp Glu Asp Glu
50 55 60

GAC CTG GAA GAG GAG GAA GAA GAG GAA GAG GAG GAT GAC GAT CAA AAG 296
Asp Leu Glu Glu Glu Glu Glu Glu Glu Glu Glu Asp Asp Asp Gln Lys
65 70 75 80 

CCC AAG AGA CGC GGC CCC AAA AAG AAG AAG ATG ACT AAG GCT CGC CTG 344
Pro Lys Arg Arg Gly Pro Lys Lys Lys Lys Met Thr Lys Ala Arg Leu
85 90 95 

GAG CGT TTT AAA TTG AGA CGC ATG AAG GCT AAC GCC CGG GAG CGG AAC 392
Glu Arg Phe Lys Leu Arg Arg Met Lys Ala Asn Ala Arg Glu Arg Asn 
100 105 110 

CGC ATG CAC GGA CTG AAC GCG GCG CTA GAC AAC CTG CGC AAG GTG GTG 440
Arg Met His Gly Leu Asn Ala Ala Leu Asp Asn Leu Arg Lys Val Val
115 120 125



CCT TGC TAT TCT AAG ACG CAG AAG CTG TCC AAA ATC GAG ACT CTG CGC 488
Pro Cys Tyr Ser Lys Thr Gln Lys Leu Ser Lys Ile Glu Thr Leu Arg
130 135 140

TTG GCC AAG AAC TAC ATC TGG GCT CTG TCG GAG ATC 524
Leu Ala Lys Asn Tyr Ile Trp Ala Leu Ser Glu Ile
145 150 155

(2) INFORMATION FOR SEQ ID NO: 9: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 156 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9: 

Met Thr Lys Ser Tyr Ser Glu Ser Gly Leu Met Gly Glu Pro Gln Pro
1 5 10 15

Gln Gly Pro Pro Ser Trp Thr Asp Glu Cys Leu Ser Ser Gln Asp Glu
20 25 30

Glu His Glu Ala Asp Lys Lys Glu Asp Asp Leu Glu Ala Met Asn Ala
35 40 45

Glu Glu Asp Ser Leu Arg Asn Gly Gly Glu Glu Glu Asp Glu Asp Glu
50 55 60

Asp Leu Glu Glu Glu Glu Glu Glu Glu Glu Glu Asp Asp Asp Gln Lys
65 70 75 80

Pro Lys Arg Arg Gly Pro Lys Lys Lys Lys Met Thr Lys Ala Arg Leu
85 90 95

Glu Arg Phe Lys Leu Arg Arg Met Lys Ala Asn Ala Arg Glu Arg Asn
100 105 110

Arg Met His Gly Leu Asn Ala Ala Leu Asp Asn Leu Arg Lys Val Val
115 120 125

Pro Cys Tyr Ser Lys Thr Gln Lys Leu Ser Lys Ile Glu Thr Leu Arg
130 135 140

Leu Ala Lys Asn Tyr Ile Trp Ala Leu Ser Glu Ile
145 150 155

(2) INFORMATION FOR SEQ ID NO: 10: 
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1535 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 
(vii) IMMEDIATE SOURCE: 

(B) CLONE: 14B1 (neuroD2)
(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 55..1194

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10: 

CCCCTCACTT TGTGCTGTCT GTCTCCCCTT CCCGCCCGGG GNCCCTCAGG CACCATGCTG 60 

ACCCGCCTGT TCAGCGAGCC CGGCCTTCTC TCGGACGTGC CCAAGTTCGC CAGCTGGGGC 120 

GACGGCGAAG ACGACGAGCC GAGGAGCGAC AAGGGCGACG CGCCGCCACC GCCACCGCCT 180

GCGCCCGGGC CAGGGGCTCC GGGGCCAGCC CGGGCGGCCA AGCCAGTCCC TCTCCGTGGA 240

GAAGAGGGGA CGGAGGCCAC GTTGGCCGAG GTCAAGGAGG AAGGCGAGCT GGGGGGAGAG 300 

GAGGAGGAGG AAGAGGAGGA GGAAGAAGGA CTGGACGAGG CGGAGGGCGA GCGGCCCAAG 360 

AAGCGCGGGC CCAAGAAGCG CAAGATGACC AAGGCGCGCT TGGAGCGCTC CAAGCTTCGG 420 

CGGCAGAAGG CGAACGCGCG GGAGCGCAAC CGCATGCACG ACCTGAACGC AGCCCTGGAC 480

AACCTGCGCA AGGTGGTGCC CTGCTACTCC AAGACGCAGA AGCTGTCCAA GATCGAGACG 540

CTGCGCCTAG CCAAGAACTA TATCTGGGCG CTCTCGGAGA TCCTGCGCTC CGGCAAGCGG 600 

CCAGACCTAG TGTCCTACGT GCAGACTCTG TGCAAGGGTC TGTCGCAGCC CACCACCAAT 660 

CTGGTGGCCG GCTGTCTGCA GCTCAACTCT CGCAACTTCC TCACGGAGCA AGGCGCCGAC 720 

GGTGCCGGCC GCTTCCACGG CTCGGGCGGC CCGTTCGCCA TGCACCCCTA CCCGTACCCG 780 

TGCTCGCGCC TGGCGGGCGC ACAGTGCCAG GCGGCCGGCG GCCTGGGCGG CGGCGCGGCG 840

CACGCCCTGC GGACCCACGG CTACTGCGCC GCCTACGAGA CGCTGTATGC GGCGGCAGGC 900 

GGTGGCGGCG CGAGCCCGGA CTACAACAGC TCCGAGTACG AGGGCCCGCT CAGCCCCCCG 960 

CTCTGTCTCA ATGGCAACTT CTCACTCAAG CAGGACTCCT CGCCCGACCA CGAGAAAAGC 1020 

TACCACTACT CTATGCACTA CTCGGCGCTG CCCGGTTCGC GCCACGGCCA CGGGCTAGTC 1080 



TTCGGCTCGT CGGCTGTGCG CGGGGGCGTC CACTCGGAGA ATCTCTTGTC TTACGATATG 1140

CACCTTCACC ACGACCGGGG CCCCATGTAC GAGGAGCTCA ATGCGTTTTT TCATAACTGA 1200

GACTTCGCGC CGNCTCCCTN CTTTTTCTTT TGCCTTTGCC CGCCCCCCTG TCCCCAGCCC 1260

CCAGAGCGCA GGGACACCCC CATNCTACCC CGGCNCCGGC GGAGCGGGCC ACCGGTCTGC 1320

CGCTCTCCTG GGGCAGCGCA GTCTGTTACN TGTGGGTGGC TGTCCCAGGG GCCTCGCTTC 1380

CCCCAGGGAC TCGCCTTCTC TCTCCAAGGG GTTCCC7CCT CCTCTCTCCG AAGGAGTGCT 1440

TCTCCAGGGA CCTCTCTCCG GGGGCTCCCT GGAGGCACCC CTCCCCCATT CCCAATATCT 1500

TCGCTGAGGT TTCCTCCTCC CCCTCCTCCC TGCAG 1535


(2) INFORMATION FOR SEQ ID NO: 11:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 381 amino acids

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 



(ii) MOLECULE TYPE: protein 
(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11: 

Met Leu Thr Arg Leu Phe Ser Glu Pro Gly Leu Leu Ser Asp Val Pro
1 5 10 15

Lys Phe Ala Ser Trp Gly Asp Gly Glu Asp Asp Glu Pro Arg Ser Asp
20 25 30

Lys Gly Asp Ala Pro Pro Pro Pro Pro Pro Ala Pro Gly Pro Gly Ala
35 40 45

Pro Gly Pro Ala Arg Ala Ala Lys Pro Val Pro Leu Arg Gly Glu Glu
50 55 60

Gly Thr Glu Ala Thr Leu Ala Glu Val Lys Glu Glu Gly Glu Leu Gly
65 70 75 80

Gly Glu Glu Glu Glu Glu Glu Glu Glu Glu Glu Gly Leu Asp Glu Ala
85 90 95

Glu Gly Glu Arg Pro Lys Lys Arg Gly Pro Lys Lys Arg Lys Met Thr
100 105 110

Lys Ala Arg Leu Glu Arg Ser Lys Leu Arg Arg Gln Lys Ala Asn Ala
115 120 125

Arg Glu Arg Asn Arg Met His Asp Leu Asn Ala Ala Leu Asp Asn Leu
130 135 140

Arg Lys Val Val Pro Cys Tyr Ser Lys Thr Gln Lys Leu Ser Lys Ile
145 150 155 160

Glu Thr Leu Arg Leu Ala Lys Asn Tyr Ile Trp Ala Leu Ser Glu Ile
165 170 175

Leu Arg Ser Gly Lys Arg Pro Asp Leu Val Ser Tyr Val Gln Thr Leu
180 185 190

Cys Lys Gly Leu Ser Gln Pro Thr Thr Asn Leu Val Ala Gly Cys Leu
195 200 205

Gln Leu Asn Ser Arg Asn Phe Leu Thr Glu Gln Gly Ala Asp Gly Ala
210 215 220

Gly Arg Phe His Gly Ser Gly Gly Pro Phe Ala Met His Pro Tyr Pro
225 230 235 240

Tyr Pro Cys Ser Arg Leu Ala Gly Ala Gln Cys Gln Ala Ala Gly Gly
245 250 255

Leu Gly Gly Gly Ala Ala His Ala Leu Arg Thr His Gly Tyr Cys Ala
260 265 270

Ala Tyr Glu Thr Leu Tyr Ala Ala Ala Gly Gly Gly Gly Ala Ser Pro
275 280 285

Asp Tyr Asn Ser Ser Glu Tyr Glu Gly Pro Leu Ser Pro Pro Leu Cys
290 295 300

Leu Asn Gly Asn Phe Ser Leu Lys Gln Asp Ser Ser Pro Asp His Glu
305 310 315 320

Lys Ser Tyr His Tyr Ser Met His Tyr Ser Ala Leu Pro Gly Ser Arg
325 330 335

His Gly His Gly Leu Val Phe Gly Ser Ser Ala Val Arg Gly Gly Val
340 345 350

His Ser Glu Asn Leu Leu Ser Tyr Asp Met His Leu His His Asp Arg
355 360 365

Gly Pro Met Tyr Glu Glu Leu Asn Ala Phe Phe His Asn
370 375 380



(2) INFORMATION FOR SEQ ID NO: 12:
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1268 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 
(vii) IMMEDIATE SOURCE: 

(B) CLONE: 20A1 (neuroD3) 
(ix) FEATURE: 

(A) NAME/KEY: CDS
(B) LOCATION: 55..768
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12:

CTGCAGCGCT CTGAGCCGCT TTCTATCTGT CCGTCGGTCC TGCACAGCGC AACG ATG 57

Met 
1 

CCA GCC CGC CTT GAG ACC TGC ATC TCC GAG CTC GAC TGC GCC AGC AGC 105 
Pro Ala Arg Leu Glu Thr Cys Ile Ser Asp Leu Asp Cys Ala Ser Ser
5 10 15 

AGC GGC AGT GAC CTA TCC GGC TTC CTC ACC GAC GAG GAA GAC TGT GCC 153 
Ser Gly Ser Asp Leu Ser Gly Phe Leu Thr Asp Glu Glu Asp Cys Ala 
20 25 30 

AGA CTC CAA CAG GCA GCC TCC GCT TCG GGG CCG CCC GCG CCG GCC CGC 201 
Arg Leu Gln Gln Ala Ala Ser Ala Ser Gly Pro Pro Ala Pro Ala Arg
35 40 45

AGG GGC GCG CCC AAT ATC TCC CGG GCG TCT GAG GTT CCA GGG GCA CAG 249
Arg Gly Ala Pro Asn Ile Ser Arg Ala Ser Glu Val Pro Gly Ala Gln
50 55 60 65 

GAC GAC GAG CAG GAG AGG CGG CGG CGC CGC GGC CGG ACG CGG GTC CGC 297
Asp Asp Glu Gln Glu Arg Arg Arg Arg Arg Gly Arg Thr Arg Val Arg
70 75 80 

TCC GAG GCG CTG CTG CAC TCG CTG CGC AGG AGC CGG CGC GTC AAG GCC 345
Ser Glu Ala Leu Leu His Ser Leu Arg Arg Ser Arg Arg Val Lys Ala
85 90 95 

AAC GAT CGC GAG CGC AAC CGC ATG CAC AAC TTG AAC GCG GCC CTG GAC 393 
Asn Asp Arg Glu Arg Asn Arg Met His Asn Leu Asn Ala Ala Leu Asp 
100 105 110 

GCA CTG CGC AGC GTG CTG CCC TCG TTC CCC GAC GAC ACC AAG CTC ACC 441 
Ala Leu Arg Ser Val Leu Pro Ser Phe Pro Asp Asp Thr Lys Leu Thr 
115 120 125 

AAA ATC GAG ACG CTG CGC TTC GCC TAC AAC TAG ATC TGG GCT CTG GCC 489 
Lys Ile Glu Thr Leu Arg Phe Ala Tyr Asn Tyr Ile Trp Ala Leu Ala
130 135 140 145 

GAG ACA CTG CGC CTG GCG GAT CAA GGG CTG CCC GGA GGC GGT GCC CGG 537 
Glu Thr Leu Arg Leu Ala Asp Gln Gly Leu Pro Gly Gly Gly Ala Arg
150 155 160 

GAG CGC CTC CTG CCG CCG CAG TGC GTC CCC TGC CTG CCC GGT CCC CCA 585 
Glu Arg Leu Leu Pro Pro Gln Cys Val Pro Cys Leu Pro Gly Pro Pro
165 170 175 

AGC CCC GCC AGC GAC GCG GAG TCC TGG GGC TCA GGT GCC GCC GCC GCC 633 
Ser Pro Ala Ser Asp Ala Glu Ser Trp Gly Ser Gly Ala Ala Ala Ala 
180 185 190 



TCC CCG CTC TCT GAC CCC AGT AGC CCA GCC GCC TCC GAA GAC TTC ACC 681
Ser Pro Leu Ser Asp Pro Ser Ser Pro Ala Ala Ser Glu Asp Phe Thr
195 200 205

TAG CGC CCC GGC GAG CCT GTT TTC TGG TTG CCA AGC CTG CCG AAA GAG 729
Tyr Arg Pro Gly Asp Pro Val Phe Ser Phe Pro Ser Leu Pro Lys Asp
210 215 220 225

TTG CTG CAC ACA AGG CCC TGT TTC ATT GCT TAG CAC TAGGCCGTTT 775
Leu Leu His Thr Thr Pro Cys Phe Ile Pro Tyr His
230 235








GTAGACACTG TTACTTTGCC CCTCCGCTAG TCAGGAGGCA ATAGATTGGG CCCAGCTGCC 835

GCCTCGGGAC CCCTCTCCAG GGGGAGGGAG GAAGCGGGAG GTTTAAAGCA GTGGGGGATA 895

CCTGAGCCGC TTGTTAGGTC GCCGCACCCT CGCGGGGGAT GTCTCTTGGT CTGTTTGTCC 955

GGCCCTCAGC CCAGCGGGCC TGCTGGCCGC CCCTAGAGGG CCTTTCCTTT TGCAGTTTCT 1015

GAACTCCAGA AAACCTCGTT TGTGAGTGGC TCAGAACTGA CGCCAGCCAC CACTTCAGTG 1075

TGATTTAGAA AAGGGAGAGA TCAGCCCGTG AAGACGAGGT GAAAAGTCAA TTTTACAATT 1135

TGTAGAACTC TAATGAAGAA AAAGGAGCAT GAAAATTCGG TTTGAGGCGG GTGACAATAC 1195

AATGAAAAGG CTTAAAAAGG AGAGAGAAGG AGTGGGCTTC ATGCATTATG GATCCCGACG 1255

CCCACCACTG GAG 1268



(2) INFORMATION FOR SEQ ID NO: 13: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 237 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:13:

Met Pro Ala Arg Leu Glu Thr Cys Ile Ser Asp Leu Asp Cys Ala Ser
1 5 10 15

Ser Ser Gly Ser Asp Leu Ser Gly Phe Leu Thr Asp Glu Glu Asp Cys
20 25 30

Ala Arg Leu Gln Gln Ala Ala Ser Ala Ser Gly Pro Pro Ala Pro Ala
35 40 45

Arg Arg Gly Ala Pro Asn Ile Ser Arg Ala Ser Glu Val Pro Gly Ala
50 55 60

Gln Asp Asp Glu Gln Glu Arg Arg Arg Arg Arg Gly Arg Thr Arg Val
65 70 75 80

Arg Ser Glu Ala Leu Leu His Ser Leu Arg Arg Ser Arg Arg Val Lys
85 90 95

Ala Asn Asp Arg Glu Arg Asn Arg Met His Asn Leu Asn Ala Ala Leu
100 105 110

Asp Ala Leu Arg Ser Val Leu Pro Ser Phe Pro Asp Asp Thr Lys Leu
115 120 125

Thr Lys Ile Glu Thr Leu Arg Phe Ala Tyr Asn Tyr Ile Trp Ala Leu
130 135 140

Ala Glu Thr Leu Arg Leu Ala Asp Gln Gly Leu Pro Gly Gly Gly Ala
145 150 155 160

Arg Glu Arg Leu Leu Pro Pro Gln Cys Val Pro Cys Leu Pro Gly Pro
165 170 175

Pro Ser Pro Ala Ser Asp Ala Glu Ser Trp Gly Ser Gly Ala Ala Ala
180 185 190

Ala Ser Pro Leu Ser Asp Pro Ser Ser Pro Ala Ala Ser Glu Asp Phe
195 200 205

Thr Tyr Arg Pro Gly Asp Pro Val Phe Ser Phe Pro Ser Leu Pro Lys
210 215 220

Asp Leu Leu His Thr Thr Pro Cys Phe Ile Pro Tyr His
225 230 235
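
The stated length of SEQ ID NO:13 (237 amino acids) can be cross-checked against the CDS location given for SEQ ID NO:12 (55..768). The following is a minimal sketch of that arithmetic, not part of the listing itself. It assumes that the CDS coordinates are 1-based and inclusive and that the CDS range includes the terminating stop codon; the function names are hypothetical.

```python
def cds_codon_count(start: int, end: int) -> int:
    """Number of codons spanned by a CDS with 1-based, inclusive coordinates."""
    length = end - start + 1
    if length % 3 != 0:
        raise ValueError(f"CDS length {length} is not a multiple of 3")
    return length // 3


def expected_protein_length(start: int, end: int, includes_stop: bool = True) -> int:
    """Residues encoded by the CDS.

    Assumption: when includes_stop is True, the last codon of the range
    is the stop codon and does not contribute a residue.
    """
    codons = cds_codon_count(start, end)
    return codons - 1 if includes_stop else codons


if __name__ == "__main__":
    # SEQ ID NO:12, CDS 55..768: 714 nucleotides, 238 codons, 237 residues.
    print(expected_protein_length(55, 768))
```

Under these assumptions the CDS spans 714 nucleotides (238 codons), which corresponds to 237 residues plus the stop codon, in agreement with the length stated for SEQ ID NO:13.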

(2) INFORMATION FOR SEQ ID NO: 14: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1560 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 
(ii) MOLECULE TYPE: cDNA 
(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 
(vii) IMMEDIATE SOURCE:

(B) CLONE: HC2A 
(ix) FEATURE: 

(A) NAME/KEY: CDS 

(B) LOCATION: 57..1126

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14:

TTTTTCTGCT TTTCTTTCTG TTTGCCTCTC CCTTGTTGAA TGTAGGAAAT CGAAACATGA 60 

CCAAATCGTA CAGCGAGAGT GGGCTGATGG GCGAGCCTCA GCCCCAAGGT CCTCCAAGCT 120 

GGACAGACGA GTGTCTCAGT TCTCAGGACG AGGAGCACGA GGCAGACAAG AAGGAGGACG 180 

ACCTCGAAGC CATGAACGCA GAGGAGGACT CACTGAGGAA CGGGGGAGAG GAGGAGGACG 240 

AAGATGAGGA CCTGGAAGAG GAGGAAGAAG AGGAAGAGGA GGATGACGAT CAAAAGCCCA 300 

AGAGACGCGG CCCCAAAAAG AAGAAGATGA CTAAGGCTCG CCTGGAGCGT TTTAAATTGA 360 

GACGCATGAA GGCTAACGCC CGGGAGCGGA ACCGCATGCA CGGACTGAAC GCGGCGCTAG 420 

ACAACCTGCG CAAGGTGGTG CCTTGCTATT CTAAGACGCA GAAGCTGTCC AAAATCGAGA 480

CTCTGCGCTT GGCCAAGAAC TACATCTGGG CTCTGTCGGA GATCCTGCGC TCAGGCAAAA 540

GCCCAGACCT GGTCTCCTTC GTTCAGACGC TTTGCAAGGG CTTATCCCAA CCCACCACCA 600 

ACCTGGTTGC GGGCTGCCTG CAACTCAATC CTCGGACTTT TCTGCCTGAG CAGAACCAGG 660 






ACATGCCCCC GCACCTGCCG ACGGCCAGCG CTTCCTTCCC TGTACACCCC TACTCCTACC 720 

AGTCGCCTGG GCTGCCCAGT CCGNCTTACG GTACCATGGA CAGCTCCCAT GTCTTCCACG 780

TTAAGCCTCC GCCGCACGCC TACAGCGCAG CGCTGGAGCC CTTCTTTGAA AGCCCTCTGA 840

CTGATTGCAC CAGCCCTTCC TTTGATGGAC CCCTCAGCCC GCCGCTCAGC ATCAATGGCA 900 

ACTTCTCTTT CAAACACGAA CCGTCCGCCG AGTTTGAGAA AAATTATGCC TTTACCATGC 960 

ACTATCCTGC AGCGACACTG GCAGGGGCCC AAAGCCACGG ATCAATCTTC TCAGGCACCG 1020 

CTGCCCCTCG CTGCGAGATC CCCATAGACA ATATTATGTC CTTCGATAGC CATTCACATC 1080 

ATGAGCGAGT CATGAGTGCC CAGCTCAATG CCATATTTCA TGATTAGAGG CACGCCAGTT 1140

TCACCATTTC CGGGAAACGA ACCCACTGTG CTTACAGTGA CTGTCGTGTT TACAAAAGGC 1200 

AGCCCTTTGG TACTACTGCT GCAAAGTGCA AATACTCCAA GCTTCAAGTG ATATATGTAT 1260 

TTATTGTCAT TACTGCCTTT GGAAGAAACA GGGGATCAAA GTTCCTGTTC ACCTTATGTA 1320 

TTATTTTCTA TAGACTCTTC TATTTTAAAA AATAAAAAAA TACAGTAAAG TTTAAAAAAT 1380 

ACACCACGAA TTTGGTGTGG CTGTATTCAG ATCGTATTAA TTATCTGATC GGGATAACAA 1440

AATCACAAGC AATAATTAGG ATCTATGCAA TTTTTAAACT AGTAATGGGC CAATTAAAAT 1500 

ATATATAAAT ATATATTTCA ACCAGCATTT TACTACTTGT TACCTCCCAT GCTGAATTAT 1560 

(2) INFORMATION FOR SEQ ID NO: 15: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 356 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 
(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15: 

Met Thr Lys Ser Tyr Ser Glu Ser Gly Leu Met Gly Glu Pro Gln Pro
1 5 10 15

Gln Gly Pro Pro Ser Trp Thr Asp Glu Cys Leu Ser Ser Gln Asp Glu
20 25 30

Glu His Glu Ala Asp Lys Lys Glu Asp Asp Leu Glu Ala Met Asn Ala
35 40 45

Glu Glu Asp Ser Leu Arg Asn Gly Gly Glu Glu Glu Asp Glu Asp Glu
50 55 60

Asp Leu Glu Glu Glu Glu Glu Glu Glu Glu Glu Asp Asp Asp Gln Lys
65 70 75 80

Pro Lys Arg Arg Gly Pro Lys Lys Lys Lys Met Thr Lys Ala Arg Leu
85 90 95

Glu Arg Phe Lys Leu Arg Arg Met Lys Ala Asn Ala Arg Glu Arg Asn
100 105 110

Arg Met His Gly Leu Asn Ala Ala Leu Asp Asn Leu Arg Lys Val Val
115 120 125

Pro Cys Tyr Ser Lys Thr Gln Lys Leu Ser Lys Ile Glu Thr Leu Arg
130 135 140

Leu Ala Lys Asn Tyr Ile Trp Ala Leu Ser Glu Ile Leu Arg Ser Gly
145 150 155 160

Lys Ser Pro Asp Leu Val Ser Phe Val Gln Thr Leu Cys Lys Gly Leu
165 170 175

Ser Gln Pro Thr Thr Asn Leu Val Ala Gly Cys Leu Gln Leu Asn Pro
180 185 190

Arg Thr Phe Leu Pro Glu Gln Asn Gln Asp Met Pro Pro His Leu Pro
195 200 205

Thr Ala Ser Ala Ser Phe Pro Val His Pro Tyr Ser Tyr Gln Ser Pro
210 215 220

Gly Leu Pro Ser Pro Xaa Tyr Gly Thr Met Asp Ser Ser His Val Phe
225 230 235 240

His Val Lys Pro Pro Pro His Ala Tyr Ser Ala Ala Leu Glu Pro Phe
245 250 255

Phe Glu Ser Pro Leu Thr Asp Cys Thr Ser Pro Ser Phe Asp Gly Pro
260 265 270

Leu Ser Pro Pro Leu Ser Ile Asn Gly Asn Phe Ser Phe Lys His Glu
275 280 285

Pro Ser Ala Glu Phe Glu Lys Asn Tyr Ala Phe Thr Met His Tyr Pro
290 295 300

Ala Ala Thr Leu Ala Gly Ala Gln Ser His Gly Ser Ile Phe Ser Gly
305 310 315 320

Thr Ala Ala Pro Arg Cys Glu Ile Pro Ile Asp Asn Ile Met Ser Phe
325 330 335

Asp Ser His Ser His His Glu Arg Val Met Ser Ala Gln Leu Asn Ala
340 345 350

Ile Phe His Asp
355



(2) INFORMATION FOR SEQ ID NO: 16:
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1951 base pairs

(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(vi) ORIGINAL SOURCE:

(A) ORGANISM: Mus musculus
(vii) IMMEDIATE SOURCE:

(B) CLONE: 1.1.1 (mouse neuroD2)
(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 230..1378

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16: 

GAATTCAAGC TAGAGGCTGG TACCCCGCCT GGTAGAGATG CCACACTCGC TCCGCGGCTC 60 

GCATGGCGCT CTGAAGACGC CGGCGCCCGC GCCTTGAGGA ACCGCTGCCC CCGCTCCCTG 120 

AAGATGGGGG AACAATGAAA TAAGCGAGAA GATTCCTCTT CTCCCCCCTC TCTCTCTTGC 180 

CCCCTCCCCC CTCCCCTCCC CTCTCCCCTT GACTCCTCTC TGAGGCACCA TGCTGACCCG 240 

CCTGTTCAGC GAGCCCGGCC TCCTCTCGGA CGTGCCCAAG TTCGCCAGCT GGGGCGACGG 300 

CGACGACGAC GAGCCGAGGA GCGACAAGGG CGACGCGCCG CCGCAGCCTT CTCCTGCTCC 360 

CGGGTCGGGG GCTCCAGGAC CCGCCCGGGC CGCCAAGCCA GTGTCTCTTC GTGGAGGAGA 420 

AGAGATCCCT GAACCCACGT TGGCTGAGGT CAAGGAGGAA GGAGAGCTGG GCGGCGAGGA 480

GGAGGAGGAA GAGGAGGAGG AGGAAGGACT GGACGAGGCG GAAGGCGAGC GGCCCAAGAA 540

GCGCGGGCCG AAGAAACGCA AGATGACCAA GGCGCGTCTG GAGCGCTCCA AGCTGCGGCG 600 

ACAGAAGGCC AATGCGCGCG AGCGCAACCG CATGCACGAC CTGAACGCGG CTCTGGACAA 660 

CCTGCGCAAG GTGGTCCCCT GCTACTCCAA GACCCAGAAG CTGTCCAAGA TCGAGACCCT 720 

GCGCCTGGCC AAGAACTACA TCTGGGCTCT CTCGGAGATC TTGCGCTCCG GGAAGCGGCC 780 

GGATCTGGTG TCCTACGTGC AGACTCTGTG CAAGGGGCTG TCACAGCCCA CCACGAATCT 840

GGTGGCCGGC TGCCTGCAGT TAAACTCTCG TAACTTCCTC ACGGAGCAGG GCGCGGACGG 900 

CGCCGGCCGC TTTCACGGCT CGGGTGGCCC GTTCGCCATG CATCCGTACC CATACCCGTG 960 

CTCCCGCCTG GCAGGCGCAC AGTGTCAGGC GGCTGGCGGC CTGGGCGGAG GCGCGGCGCA 1020 

CGCCCTGCGG ACCCACGGCT ACTGCGCCGC CTACGAGACG CTGTACGCGG CGGCCGGTGG 1080 

CGGCGGCGCT AGCCCGGACT ACAACAGCTC CGAGTACGAG GGTCCACTCA GTCCCCCGCT 1140

CTGTCTCAAC GGCAACTTCT CGCTCAAGCA GGACTCGTCC CCCGATCACG AGAAGAGCTA 1200

CCACTACTCT ATGCACTACT CGGCGCTGCC CGGCTCACGC CACGGCCACG GGCTGGTCTT 1260 

CGGCTCGTCG GCCGTGCGCG GGGGCGTCCA CTCCGAGAAT CTCTTGTCTT ACGATATGCA 1320 

CCTTCACCAC GATCGGGGCC CCATGTACGA GGAGCTCAAC GCATTTTTCC ATAACTGAGA 1380 

CCTCNCGCCG ACCCCTTCTT TTTCTTTGCC TTTGTCCGGC CCCTTAGCCC CAGCCCCANN 1440

AGCTCAGGGA GCTCCCACCG ACTCCAGAGC CGGGCNCTCG NCNCGCCGCC GGTTCTGCAG 1500

CTCTCCAGAG CGGCGTGCTC TCTTACCTGT GGGTGGCCCG TCCCAGGGGC CTCGCTTGCC 1560 

TCTGGGGACT CGCCTTCTCT CTCTCCCCAG CGGCTTCCTC CTCCCTTCTC TCGTGGAGAG 1620 

CATCTCTNNN GATCTCCCGC CAGCCCTCCC AAGAGACTTC CTCCACATTC CCAAACTTGG 1680 

GTTTTCTCTC CCCACCTCCA ACAGGCCAGA GGAGTTGGTA AGGGGTGCTG AGTCTCGGGA 1740 

TAGTGTCTCC CCACTTATAG TTACTTAAAC AAACAAACAG ACACAGAGCT TCCAGCNAAA 1800 

AGAGTTGGTA TCTCTTCCTT CTCGAAGANC ACCAGCCAGG AGCCCAACCG CCTTCACCCT 1860 

AACACNGAAT CTCCNNGTTT TTTATTTTTT ATTTTGGTGG GAGGGGATGT GGATTGAGAG 1920 

GAAAGAGAGA GCCAAGCCAA TTTGTAACTA G 1951

(2) INFORMATION FOR SEQ ID NO: 17: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 382 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 
(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Mus musculus 
(vii) IMMEDIATE SOURCE: 

(B) CLONE: 1.1.1 (murine neuroD2) 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:17:

Met Leu Thr Arg Leu Phe Ser Glu Pro Gly Leu Leu Ser Asp Val Pro
1 5 10 15

Lys Phe Ala Ser Trp Gly Asp Gly Asp Asp Asp Glu Pro Arg Ser Asp
20 25 30

Lys Gly Asp Ala Pro Pro Gln Pro Ser Pro Ala Pro Gly Ser Gly Ala
35 40 45

Pro Gly Pro Ala Arg Ala Ala Lys Pro Val Ser Leu Arg Gly Gly Glu
50 55 60

Glu Ile Pro Glu Pro Thr Leu Ala Glu Val Lys Glu Glu Gly Glu Leu
65 70 75 80

Gly Gly Glu Glu Glu Glu Glu Glu Glu Glu Glu Glu Gly Leu Asp Glu
85 90 95

Ala Glu Gly Glu Arg Pro Lys Lys Arg Gly Pro Lys Lys Arg Lys Met
100 105 110

Thr Lys Ala Arg Leu Glu Arg Ser Lys Leu Arg Arg Gln Lys Ala Asn
115 120 125

Ala Arg Glu Arg Asn Arg Met His Asp Leu Asn Ala Ala Leu Asp Asn
130 135 140

Leu Arg Lys Val Val Pro Cys Tyr Ser Lys Thr Gln Lys Leu Ser Lys
145 150 155 160

Ile Glu Thr Leu Arg Leu Ala Lys Asn Tyr Ile Trp Ala Leu Ser Glu
165 170 175

Ile Leu Arg Ser Gly Lys Arg Pro Asp Leu Val Ser Tyr Val Gln Thr
180 185 190

Leu Cys Lys Gly Leu Ser Gln Pro Thr Thr Asn Leu Val Ala Gly Cys
195 200 205

Leu Gln Leu Asn Ser Arg Asn Phe Leu Thr Glu Gln Gly Ala Asp Gly
210 215 220

Ala Gly Arg Phe His Gly Ser Gly Gly Pro Phe Ala Met His Pro Tyr
225 230 235 240

Pro Tyr Pro Cys Ser Arg Leu Ala Gly Ala Gln Cys Gln Ala Ala Gly
245 250 255

Gly Leu Gly Gly Gly Ala Ala His Ala Leu Arg Thr His Gly Tyr Cys
260 265 270

Ala Ala Tyr Glu Thr Leu Tyr Ala Ala Ala Gly Gly Gly Gly Ala Ser
275 280 285

Pro Asp Tyr Asn Ser Ser Glu Tyr Glu Gly Pro Leu Ser Pro Pro Leu
290 295 300

Cys Leu Asn Gly Asn Phe Ser Leu Lys Gln Asp Ser Ser Pro Asp His
305 310 315 320

Glu Lys Ser Tyr His Tyr Ser Met His Tyr Ser Ala Leu Pro Gly Ser
325 330 335

Arg His Gly His Gly Leu Val Phe Gly Ser Ser Ala Val Arg Gly Gly
340 345 350

Val His Ser Glu Asn Leu Leu Ser Tyr Asp Met His Leu His His Asp
355 360 365

Arg Gly Pro Met Tyr Glu Glu Leu Asn Ala Phe Phe His Asn
370 375 380



(2) INFORMATION FOR SEQ ID NO: 18: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 
(ii) MOLECULE TYPE: cDNA 

(vii) IMMEDIATE SOURCE: 
(B) CLONE: JL34 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 18: 



CTCAGCATCA GCAACTCGGC 20
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(2) INFORMATION FOR SEQ ID NO: 19: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 29 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 
(ii) MOLECULE TYPE: cDNA 

(vii) IMMEDIATE SOURCE: 
(B) CLONE: JL36 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 19: 

TCGGATCCCG TTCTAGGCGC GCCTTGGTC 

(2) INFORMATION FOR SEQ ID NO: 20: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 21 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 
(ii) MOLECULE TYPE: cDNA 

(vii) IMMEDIATE SOURCE: 
(B) CLONE: JL40
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:20: 

GTTTTCCCAG TCACGACGTT G 

(2) INFORMATION FOR SEQ ID NO: 21: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1333 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Mus musculus 
(vii) IMMEDIATE SOURCE: 

(B) CLONE: neuroD3 
(ix) FEATURE: 

(A) NAME/KEY: CDS 

(B) LOCATION: 101..835

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 21: 

CTGCAGAGGA CAGGTAGCCC CGGGTCGTAC GGACAGTAAG TGCGCTTCGA AGGCCGACCT 60

CCAAACCTCC TGTCCGTCTG TCGGTCCTGC ACACTGCAAG ATG CCT GCC CCT TTG 115
Met Pro Ala Pro Leu
1 5

GAG ACC TGC ATC TCT GAT CTC GAC TGC TCC AGC AGC AAC AGC AGC AGC 163
Glu Thr Cys Ile Ser Asp Leu Asp Cys Ser Ser Ser Asn Ser Ser Ser
10 15 20

GAC CTG TCC AGC TTC CTC ACC GAC GAG GAG GAC TGT GCC AGG CTA CAG 211
Asp Leu Ser Ser Phe Leu Thr Asp Glu Glu Asp Cys Ala Arg Leu Gln
25 30 35

CCC CTA GCC TCC ACC TCG GGG CTG TCC GTG CCA GCC CGG AGG AGC GCT 259
Pro Leu Ala Ser Thr Ser Gly Leu Ser Val Pro Ala Arg Arg Ser Ala
40 45 50

CCC GCC CTC TCC GGG GCA TCG AAT GTT CCC GGT GCC CAG GAC GAA GAG 307
Pro Ala Leu Ser Gly Ala Ser Asn Val Pro Gly Ala Gln Asp Glu Glu
55 60 65

CAG GAA CGG CGG AGG CGG CGA GGT CGC GCT CGG GTG CGG TCC GAG GCT 355
Gln Glu Arg Arg Arg Arg Arg Gly Arg Ala Arg Val Arg Ser Glu Ala
70 75 80 85

CTG CTG CAC TCC CTG CGG AGG AGT CGT CGC GTC AAA GCC AAC GAT CGC 403
Leu Leu His Ser Leu Arg Arg Ser Arg Arg Val Lys Ala Asn Asp Arg
90 95 100

GAG CGC AAC CGC ATG CAC AAC CTC AAC GCT GCG CTG GAC GCC TTG CGC 451
Glu Arg Asn Arg Met His Asn Leu Asn Ala Ala Leu Asp Ala Leu Arg
105 110 115

AGC GTG CTG CCC TCG TTC CCC GAC GAC ACC AAG CTC ACC AAG ATT GAG 499
Ser Val Leu Pro Ser Phe Pro Asp Asp Thr Lys Leu Thr Lys Ile Glu
120 125 130

ACG CTG CGC TTC GCC TAC AAC TAG ATC TGG GCC CTG GCT GAG ACA CTG 547
Thr Leu Arg Phe Ala Tyr Asn Tyr Ile Trp Ala Leu Ala Glu Thr Leu
135 140 145

CGC CTG GCA GAT CAA GGG CTC CCC GGG GGC AGT GCC CGG GAG CGC CTC 595
Arg Leu Ala Asp Gln Gly Leu Pro Gly Gly Ser Ala Arg Glu Arg Leu
150 155 160 165

CTG CCT CCG CAG TGT GTC CCC TGT CTG CCC GGG CCC CCG AGC CCG GCC 643
Leu Pro Pro Gln Cys Val Pro Cys Leu Pro Gly Pro Pro Ser Pro Ala
170 175 180

AGC GAC ACT GAG TCC TGG GGT TCC GGG GCC GCT GCC TCC CCC TGC GCC 691
Ser Asp Thr Glu Ser Trp Gly Ser Gly Ala Ala Ala Ser Pro Cys Ala
185 190 195

ACT GTG GCA TCA CCA CTC TCT GAC CCC AGT AGT CCC TCG GCT TCA GAA 739
Thr Val Ala Ser Pro Leu Ser Asp Pro Ser Ser Pro Ser Ala Ser Glu
200 205 210

GAC TTC ACC TAT GGC CCG GGC GAT CCC CTT TTC TCC TTT CCT GGC CTG 787
Asp Phe Thr Tyr Gly Pro Gly Asp Pro Leu Phe Ser Phe Pro Gly Leu
215 220 225

CCC AAA GAC CTG CTC CAC ACG ACG CCC TGT TTC ATC CCA TAC CAC TAGGCCTTTG 842
Pro Lys Asp Leu Leu His Thr Thr Pro Cys Phe Ile Pro Tyr His
230 235 240 245

TAAGGCAACA TCAATACATT CTTCCTCCCC CAGTCTAAGA GCAATAATAG ATGGGGAACT 902 

GGCTGAAGCC TCCGGGGGCC ACACTTACCC CCAAGTGAAT TCTGGGAGCT TTAAAGGGGG 962 



GAGGGGGAAT ACCTGACCAC TTGTTAGGTT GCTGCACCCT CGCTGAAGCT GCCCTCGGTC 1022

TATTTCTCCA CCCCCAGCAC GGCCTCCCCC CCCCCCGCCC GCCCCCAGAC GGCCTTTCGT 1082

TTTTGTTGCA CTTTCTGAAC TTCACAAAAC CTTCTTTGTG ACTGGCTCAG AACTGACCCC 1142

AGCCACCACT TCAGTGTGGT TTGGAAAAGG GACAGATGAG CCCCTGAAGA CGAGGTGAAA 1202 

AGTCAATTTT ACAATTTGTA GAACTCTAAT GAAGAAAAAC GAGCATGAAA ATTCGGTTTG 1262 

AGCCGGCTGA CAATACAATG GCAAGGCTTA AAAAGGAGCC ACAAGGAGTG GGCTTCATGC 1322 

ATTATGGATC C 1333

(2) INFORMATION FOR SEQ ID NO:22: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 244 amino acids 

(B) TYPE: amino acid
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:22: 

Met Pro Ala Pro Leu Glu Thr Cys Ile Ser Asp Leu Asp Cys Ser Ser
1 5 10 15

Ser Asn Ser Ser Ser Asp Leu Ser Ser Phe Leu Thr Asp Glu Glu Asp
20 25 30

Cys Ala Arg Leu Gln Pro Leu Ala Ser Thr Ser Gly Leu Ser Val Pro
35 40 45

Ala Arg Arg Ser Ala Pro Ala Leu Ser Gly Ala Ser Asn Val Pro Gly
50 55 60

Ala Gln Asp Glu Glu Gln Glu Arg Arg Arg Arg Arg Gly Arg Ala Arg
65 70 75 80

Val Arg Ser Glu Ala Leu Leu His Ser Leu Arg Arg Ser Arg Arg Val
85 90 95

Lys Ala Asn Asp Arg Glu Arg Asn Arg Met His Asn Leu Asn Ala Ala
100 105 110

Leu Asp Ala Leu Arg Ser Val Leu Pro Ser Phe Pro Asp Asp Thr Lys
115 120 125

Leu Thr Lys Ile Glu Thr Leu Arg Phe Ala Tyr Asn Tyr Ile Trp Ala
130 135 140

Leu Ala Glu Thr Leu Arg Leu Ala Asp Gln Gly Leu Pro Gly Gly Ser
145 150 155 160

Ala Arg Glu Arg Leu Leu Pro Pro Gln Cys Val Pro Cys Leu Pro Gly
165 170 175

Pro Pro Ser Pro Ala Ser Asp Thr Glu Ser Trp Gly Ser Gly Ala Ala
180 185 190

Ala Ser Pro Cys Ala Thr Val Ala Ser Pro Leu Ser Asp Pro Ser Ser
195 200 205

Pro Ser Ala Ser Glu Asp Phe Thr Tyr Gly Pro Gly Asp Pro Leu Phe
210 215 220

Ser Phe Pro Gly Leu Pro Lys Asp Leu Leu His Thr Thr Pro Cys Phe
225 230 235 240

Ile Pro Tyr His
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The embodiments of the invention in which an exclusive property or privilege 
is claimed are defined as follows: 

1. An isolated polynucleotide molecule that encodes a neuroD
polypeptide and that hybridizes under stringent conditions with a nucleic acid
molecule selected from among SEQ ID NO:1, SEQ ID NO:3, SEQ ID NO:8, SEQ ID
NO:10, SEQ ID NO:12, SEQ ID NO:14, SEQ ID NO:16, SEQ ID NO:21, or its
complement.

2. The isolated polynucleotide molecule of Claim 1, said polynucleotide
molecule encoding a human neuroD2 polypeptide, and further being capable of
hybridizing under stringent conditions with the nucleotide sequence of SEQ ID
NO:10, or its complement.

3. The isolated polynucleotide molecule of Claim 1, said polynucleotide
molecule encoding a human neuroD3 polypeptide, and further being capable of
hybridizing under stringent conditions with the nucleotide sequence of SEQ ID
NO:12, or its complement.

4. An isolated polynucleotide molecule that comprises at least
15 nucleotides and that hybridizes under stringent conditions with a neuroD HLH
domain selected from among nucleotides 577-696 of SEQ ID NO:1,
nucleotides 376-495 of SEQ ID NO:3, nucleotides 149-268 of SEQ ID NO:8,
nucleotides 463-582 of SEQ ID NO:10, nucleotides 368-496 of SEQ ID NO:12,
nucleotides 405-524 of SEQ ID NO:14, nucleotides 642-761 of SEQ ID NO:16,
nucleotides 425-544 of SEQ ID NO:21, or its complement.

5. A vector comprising the following operably linked elements: a 
promoter, the polynucleotide molecule of Claim 1, and a transcription termination 
signal. 

6. A cell transformed by the polynucleotide molecule of Claim 1.

7. A recombinant peptide encoded by the polynucleotide molecule of 
Claim 1. 

8. An antibody or antigen-binding fragment thereof that binds to the
recombinant peptide of Claim 7.
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9. An antibody or antigen-binding fragment thereof that binds to a 
polypeptide selected from among SEQ ID NO:2, SEQ ID NO:4, SEQ ID NO:9, SEQ
ID NO:11, SEQ ID NO:13, SEQ ID NO:15, SEQ ID NO:17, and SEQ ID NO:22.

10. An antibody or antigen-binding fragment thereof that binds to a
peptide selected from among amino acid residues 117-156 of SEQ ID NO:2, amino
acid residues 118-157 of SEQ ID NO:4, amino acid residues 117-156 of SEQ ID
NO:9, amino acid residues 137-176 of SEQ ID NO:11, amino acid residues 108-
147 of SEQ ID NO:13, amino acid residues 117-156 of SEQ ID NO:15, amino acid
residues 138-177 of SEQ ID NO:17, and amino acid residues 109-148 of SEQ ID
NO:22.
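
The nucleotide ranges recited for the HLH domain in Claim 4 and the amino acid residue ranges recited in Claim 10 are related through the start position of the coding sequence. The following is a minimal illustrative sketch of that coordinate arithmetic and forms no part of the claims. It assumes 1-based, inclusive coordinates and an uninterrupted reading frame beginning at the CDS start; the function names are hypothetical.

```python
def residues_to_nucleotides(cds_start: int, first_residue: int, last_residue: int) -> tuple[int, int]:
    """Map a residue range to the nucleotide range that encodes it.

    All coordinates are 1-based and inclusive. Assumption: the reading
    frame starts at cds_start and runs without interruption.
    """
    nt_first = cds_start + 3 * (first_residue - 1)
    nt_last = cds_start + 3 * last_residue - 1
    return nt_first, nt_last


def extract_region(sequence: str, nt_first: int, nt_last: int) -> str:
    """Return the 1-based, inclusive region of a nucleotide string."""
    return sequence[nt_first - 1:nt_last]


if __name__ == "__main__":
    # SEQ ID NO:14 (CDS starts at 57), residues 117-156 of SEQ ID NO:15
    print(residues_to_nucleotides(57, 117, 156))
    # SEQ ID NO:21 (CDS starts at 101), residues 109-148 of SEQ ID NO:22
    print(residues_to_nucleotides(101, 109, 148))
```

For example, with the CDS of SEQ ID NO:14 beginning at nucleotide 57, residues 117-156 of SEQ ID NO:15 map to nucleotides 405-524; with the CDS of SEQ ID NO:21 beginning at nucleotide 101, residues 109-148 of SEQ ID NO:22 map to nucleotides 425-544. The ranges as recited in the claims control in every case.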
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